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ABSTRACT 

We address a long-standing problem, how can we extract information in the non-Gaussian 
regime of weak lensing surveys, by accurate modeling of all relevant covariances between the 
power spectra and bispectra. We use 1000 ray-tracing simulation realizations for a A-CDM 
model and an analytical halo model. We develop a formalism to describe the covariance matri- 
ces of power spectra and bispectra of all triangle configurations, which extend to 6-point cor- 
relation functions. We include a new contribution arising from coupling of the lensing Fourier 
modes with large-scale mass fluctuations on scales comparable with the survey region via halo 
bias theory, which we call the halo sample variance (HSV) effect. We show that the model 
predictions are in excellent agreement with the simulation results for the power spectrum and 
bispectrum covariances. The HSV effect gives a dominant contribution to the covariance ma- 
trices at multipoles I > 10 3 , which arise from massive halos with masses > 10 14 M Q and at 
relatively low redshifts z < 0.4. Since such halos are easy to identify from a multi-colour 
imaging survey, the effect can be estimated from the data. The bispectra add information to 
the power spectrum, and increase the total information content or the cumulative signal-to- 
noise ratio up to a maximum multipole of a few 10 3 , (SyiV)j max , by 20 - 50% compared to 
that of the power spectrum alone for upcoming surveys, which is equivalent to a factor of 1 .4 
- 2.3 larger survey area for the power spectrum measurement alone. However, the total infor- 
mation content of the power spectrum and bispectrum is still smaller than that for a Gaussian 
field by a factor of 3, mostly due to the HSV effect. Thus bispectrum measurements are use- 
ful for cosmology, but using information from upcoming surveys requires that non-Gaussian 
covariances are carefully estimated. 

Key words: cosmology: theory — gravitational lensing — large-scale structure of the uni- 
verse 



1 INTRODUCTION 

The accelerated expansion of the Universe is the most tantalizing problem in modern cosmology. Within Einstein's gravity theory, General 
Relativity, cosmic acceleration can be explained by introducing dark energy, which acts as a repulsive, rather than attractive, force on 
the expansion of th e universe. Alternativ ely, cosmic acceleration may be a signature of the breakdown of Einstein's gravity theory on 
cosmological scales I Jain & Khourvllioioi for a review). Many on-going and upcoming wide-area galaxy surveys aim at testing dark energy 
and modified gravity scenarios as the origin of cosmic acceleration. In particular, ongoing and upcoming multi-colour imaging surveys 
include: the CFHT Weak Lensing Survey q the Panoramic Survey Telescope & Rapid Response System (Pan-STARRsH), the VST Kilo- 
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Degree Surve)[f[ the Subaru Hyper Suprime-Cam Survey jlVlivazaki et alj 2006H the Dark Energy Survey (DES0), and in the next decade, 
the Large Synoptic Sky Survey (LSSlJj), the ESA Euclid satellite missiorQ, and the NASA WFIRST satellite mission^. 

Among different cosmological probes, weak gravitational lensing or cosmic shear is recogni zed as one of the most promising methods 



for constraining the nature o f dark energy, provided systematic errors are well under control (see Bartelmann & Schneider 2001 



Schneider 



2006; Hoe kstra & Jainll2008l . for reviews). The bending of light rays emitted from a distant galaxy due to the foreground mass distribution 



causes the image to be distorted. The distortion signal is too weak for us to measure in single galaxies, but we can use a sufficiently 
large number of galaxy images, available from wide-field survey, to detect the correlated shear signals existing in-between different galaxy 
images. Weak lensing is a unique method of measuring the total matter distribution including dark matter, free of galaxy bias uncertainty, 
and allows a direct comparison of the measurement with theory that is in most case about the statistica l properties of the dark matter 
distri bution. The theoretical predictions are obtaine d using N-body simulations (e.g., Springel et al. 2006h and/or analytical approaches 
(e.g., Bemardgau_etaL 2002; Coora y & Shethl 2002). The cosmological weak lensing signal has been measured by several groups (e.g., 



Bernardeau et al 
Hamana et al.l2003l ; 



Hoekstra & Jain 2008, and also see references therein), and we are waiting for measurements with much higher statistical 



precision from upcoming surveys. 

Most previous works on weak lensing, in theoretical and observational studies, have focused on the shear two-point correlation function 
or equivalently its Fourier-transform, the power spectrum as the measurement method of weak lensing. The two-point correlation function 
contains the f ull information contai ned in the weak lensing field, only if the lensing field is Gaussian as in the cosmic microwave background 
(CMB) field dKomatsu et alj|201 lh . However this is not the case in reality. Nonlinear clustering in structure formation causes a coupling 
between different Fourier modes, and the mass density field at redshifts relevant for galaxy surveys is non-Gaussian. Since the lensing field is 
related to the mass density field via projection, the lensing field is also non-Gaussian. Hence, the power spectrum is not sufficient to extract 
the full information content of the lensing field. In f act, various studies have shown that the information content carrie d by the power spectrum 
might be saturated at multipole scales o f a few 10 3 ; lHamiltonetail2 006; Taka hashi et alT2 009; Nev rinck et a"il2 009. for the 3D m ass density 
field) l Sato et alj2009l ; Seo et alj2011 , for the lensing field), and jLee & Penl2008l , for the result from the actual data). In particular. lSato et al] 
(2009) used 1000 ray- tracing simulation realizations for a ACDM model to study the power spectrum covariance and the information content 
of the power spectrum. They found that the information content is reduced by a factor of 2 at multipoles I ~ 10 3 compared to the Gaussian 
case for a survey with typical source redshift z s ~ 1. Further, they showed that large-scale mass density fluctuations of scales outside the 
simulation area cause a significant contribution to non-Gaussian terms of the error covariance. They developed a formalism to describe the 
new non-Gaussian errors via th e number fluctuations of massive halos base d on halo bias theory, which we hereafter call the halo sample 
variance (HSV) effect (also see lHu & Kravtsovll2003l ; lTakada & Bridl3l2007t) . 

Some fundamental questions remain unresolved: How important and useful are the non-Gaussian signals in the lensing field for cosmol- 
ogy? Which statistical method to extract the non-Gaussian signals is most useful? Can we recover the Gaussian information content, which 
should have existed in the linear field or the primordial field, by combining the power spectrum and the non-Gaussian signals? For weak lens- 
ing, there is additional expectation that the non-Gaussian signals will be useful for cosmology, because the skewness for example has been 



shown co mplementary to the power spectrum in its dependence on cosmological parameters l Bernardeau et al. 199 - / ; Jain & Seli 



Hui 1999; Jain et al. 2000; White & Hu 2000; Hamana & Mellier 2001 ; Van Waerbeke et al. 2001 ; Cooray & Hu 2001b; Takada & Jain 200: 
2004l : lDodelson & Zhang|2005l : lKilbinger & SchneideJ2005l : ISemboloni et alj2008r . lBerge et al.l20ld : lMunshi et all201 lMPires et alkoi 



The attempt to measure the non-Gaussian signals from actual data was also made by several groups iBernardea^ietal* 20021 : Zhang et al. 
20031 : Jarvis et al although the detection was not significant yet. 

In this paper, we study the lensing bispectrum - the Fourier-transformed counterpart of the three-point correlation function. It contains 
the lowest-order non-Gaussianity of the weak lensing field and is a natural extension of the two-point correlation function. We consider all 
the triangle configurations available from a given range of multipoles and include the full covariance matrix including the non-Gaussian error 
contr ibutions up to the 6-point correlation functions. Gaussian errors were assumed in most previous work i Takada & Jainl2004 ; Martin et al 



2012). We use the 1000 simulation realizations to study the usefulness and complementarity of the lensing bispectrum compared to the power 



spectrum, and also develop an analytical formula to describe the bispectrum covariance for a given cosmology. In particular, we will show 
that the halo sample variance gives a significant contribution to the bispectrum covariance at I > a few 10 , and that the bispectrum does 
carry additional information to the power spectrum even in the presence of these significant correlations. Thus we will give a quantitative 
answer to the fundamental questions above. 

This paper is organized as follows. After briefly reviewing the lensing power spectrum and bispectrum in §[2] we develop a formulation 
to describe the bispectrum covariance in §[3] In §|4] we show the main results: we study the bispectrum covariance using both simulations and 
analytical model predictions. We quantify the information content of the lensing bispectrum by including contributions from all the triangle 
configurations. §H]is devoted to discussion and conclusion. 
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2 PRELIMINARIES : LENSING POWER SPECTRUM AND BISPECTRUM 

In the context of cosmological gravitational lensing, the c onvergence field is expressed as a we ighted projection of the three-dimensional 
density fluctuation field between source and observer (see Bartelmann & Schn eider 2001 ; Schneiderll2006l for a thorough review) 



k(0)= / d x w GL (x)S[ x ,xe], (i) 



where 9 is the angular position on the sky, \ is the comoving distance, and \h is the distance to the Hubble horizon. Note that we assume a 
flat geometry throughout this paper, and the radial distance \ is equivalent to the comoving angular diameter distance. The comoving distance 
X(o) from an observer at a — 1 to a source at a is expressed in terms of the Hubble expansion rate H(a) as x( a ) = J da' /[H(a')a 2 ]. For 
a single redshift of source galaxies, the lensing efficiency function Wgl(x) is defined as 

WgUx) = IffoiW- 1 * ( 1 - ^ J , (2) 



where \s is the distance to the source galaxies. When including tomographic binning of source redshifts, see Eq. (4) in Taka da&Jainl 12009) 



for the lensing efficiency functions for different redshift bins. Under the flat-sky approximation, the Fourier transform of the lensing field is 
defined as 

For a finite sky cover age of a survey, we need to use the discrete Fourier decomposition, rather than the infinite-range Fourier decomposition 
(also see Appendix in Takada & Bridle l2007i for the details). 



In this paper, we study the bispectrum of lensing field and the covariance. The bispectrum is the Fourier-transformed counterpart of 
the three-point correlation function. The n-point power spectra relevant for the bispectrum covariance are defined in terms of the ensemble 
averages of the convergence fields in Fourier space as 



{k h k l2 ) = (2nyP(h)5Uh+h), (4) 

(k^k^K!,) = (2n) 2 B(l 1 ,l 2 ,l 3 )S 2 D (h+l 2 +h), (5) 

(«i 1 «j 2 «i 3 «J 4 )c = (27t) 2 T(Ii, h, h, h)S 2 D (h + h + h), (6) 

(k li k l2 ...k l J c ee (2n) 2 P n (l 1 ,l 2 ,...,l n )8 2 D {l 1 +l 2 + ---+ln) ifn>5, (7) 

where <5fj(Z) is the Dirac delta function; P, B and T are the lensing power spectrum, bispectrum and trispectrum, respectively; P n is the 
n-point power spectrum. For the bispectrum covariance, we need to include up to the 6-point power spectra Pq. Note that the higher-order 
correlation function than the bispectrum is the connected part of the n-point function, which characterize non-Gaussianity of the lensing 
field and cannot be expressed in terms of products of the power spectrum (and the lower-order correlation functions). Exchange symmetry of 
wavevectors, Zj o lj, reflects that the n-point correlation functions are invariant under permutations of the arguments. In addition, the delta 
function enforces translation invariance; adding constant vector into each argument vector I t does not change the n-point power spectrum. 
In other words, the vectors of l\, I2, ■ ■ ■ , In form the closed n-point polygonal-shape in Fourier space, i.e. Zi + li + • • • + l n =0. 

The lensing power spectrum and bispectrum can be given as the weighted line-of-sight proje ction of the t hree-dimensional power 
spectrum and bispectrum of the underlying mass distribution. Employing the Limber's approximation jLimberll954h and the flat-sky approx- 



imation, the lensing power spectrum and bispectrum are expressed in terms of the three-dimensional mass spectra as 

P(l) = £ H d X W^(x)x~ 2 Ps (k= ^;x) , (8) 



XH 

3 / \ -4 



B{hM,h) = / d X W G ^( X )x "*fl«(*i,*2,fc 8 ;x)lfc 1= l. /x , (9) 
Jo 

where P5 and B5 are the power spectrum and bispectrum of the matter distribution at each redshift x(= x( 2 ))- Thus once the n-point spectra 
of the matter field are given for a given cosmological model, we can compute the n-point spectra of the lensing field. The above equation 
also means that statistical properties of the lensing spectra arise from those of the mass density field, since the prefactor functions such as the 
efficiency function Wgl{ X ) is a pure geometrical factor, and not a statistical variable. 

In reality, the power spectrum measurement is affected by the intrinsic shape noise due to a finite sampling of source galaxy shapes: 

p obs (0 = f(0 + — , (io) 

Tig 

where <r e is the rms of intrinsic ellipticities per component, and n 9 is the mean number density of source galaxies per unit steradian. In the 
following, we will often omit the notation obs to refer P° hs {l) for notational simplicity. The bispectrum is a measure of non-Gaussianity in 
the lensing field, so is not affected by the random shape noise. 
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3 LENSING COVARIANCE 



The covariances of the lensing spectrum and bispectrum describe measurement accuracies of the spectra for a given survey. There are several 
sources of the measurement errors: the shot noise arising due to a finite sampling of galaxy shapes and the sample variance arising due to a 
finite survey area. If the lensing field is Gaussian, the different Fourier modes with I 7^ I' are independent, and therefore the sample variance 
is determined by the number of independent Four ier modes fo r a given multipole bin I that are available from the survey, yielding a simple 
formula of the sample variance contribut ion (e.g. Knoxl|l995t) . However, this is not c ase for the lensing field, because the lensing field is 
highly non-Gaussian at scales of interest Takada & Jainll 20041 12009; Sato et al. 2009), and the different Fourier modes are correlated with 
each other. In the following, we discuss theory for the lensing covariance matrices. 



3.1 Power spectrum covariance 

The power spectrum covariance has been w ell studied by previous works 1 Scoccimarro et al. 19991 : Coorav & Hij 200 lal ; Takada & Bridle 
2007l : lTakada & Jainl2009l : ISato et alj[2 009). In particular, [SatoetaL | j2009h derived an expression of the power spectrum covariance includ- 
ing a new contribution from the large-scale mass density fields of scales outside survey area, and showed that the formula can well reproduce 
their ray- tracing simulation results. According to this work, the power spectrum covariance is given as 



Cov[P(i i ),P(Z J -)] = 



■A^ pairs (/) 



P{U) + — 



1 

+ a 



d'H 



d 2 i' 



\i\ &i A (MJ\i'\ &j Mii) 



T(l,-l,l',-l') + CovMv(U,l 3 -Ms), 



(11) 



where 8f : l . is the Kronecker delta, . 



1 if h 



lj within the bin width and otherwise 5 ; A ; . 



0; O s is the survey area in units of 
steradian; A(li) is the area of the above integration in Fourier space, given as A s (li) = Jm si . d 2 l, where the integration range is confined to 
the wavevectors satisfying the condition h — Al/2 ^ \l\ ^ U + Al/2 (Al is the bin width around the i-th bin, U)\ the third term on the r.h.s., 
denoted as Covhsv, is the new contribution which we call the halo sample variance (HSV) contribution (see below). The quantity -/V pa i rs (7;) 
is the number of independent pairs of two vectors I and — I in Fourier space, where the vector I has the length U to within the bin width and 
"independent" here means different pairs discriminated by the fundamental Fourier mode of a given survey, If ~ 2n/Q s (Q s is the angular 
scale of the survey area). At the limit U 3> If, A{U) ~ 2nUAl and the number of independent Fourier modes is given as 

2nkAl Q, s liAl 



-^Vpairs {J'i ) 



(27t/e s 



2tt 



= 2/ s kyZiAZ, 



(12) 



where f E ^ y is the sky fraction defined as / s k y = fi a /47T. See iTakada & Bridle! ( 120070 for a pedagogical derivation of the power spectrum 
covariance based on the discrete Fourier decomposition formulation (except for the third term Covf^fy). In Eq. i ll It . we ignored effects of 
non-trivial survey geometry for simplicity. 

The first and second terms on the r.h.s. of Eq. il lb are standard covariance terms studied in most previous works. The first term describes 
the Gaussian covariance term that vanishes when U 7^ lj, i.e. no correlation between different multipole bins. The second term gives a non- 
Gaussian term arising form the lensing trispectrum (4-point correlation function), which describes the mode coupling between different 
multipole bins. If the lensing field is Gaussian as in the CMB field, the covariance has only the Gaussian term (the first term). Both the two 
terms scale with survey area as 1/Q S \ the amplitudes decrease with increasing the survey area. It should also be noted that the Gaussian term 
depends on the multipole bin width, while the non-Gaussian terms not; a larger bin width relatively reduces the Gaussian term contribution 
at the multipole bin. 

The third term of Eq. i ll It arises from the mode coupling of the Fourier mode of our interest, U, with large-scale modes of scales 
comparable with or even outside the survey region. Such large-scale modes cannot be seen by an observer, but affect the power spectrum 
estimation as follows. For example, if the entire survey region happens to be in the overdensity region in the universe, which can be caused 
by the large-scale mass density fluctuations, the number of massive h alos become to be larger than the ensemble average according to the 
halo bias theory, and vice versa jMo&Whitelll996l : Sheth et al. 2001). Thus the number of massive halos fou nd in a finite survey volume 
becomes correlated with the mass density fluctuations of scales comparable with or larger than the survey field dHu & Kravtsovi2003r) . To be 
more explicit, the number fluctuations of halos in the mass M and in the redshift slice centred at z are given as 



8N(M) = b(M)-f^-n B Az-^-5 m (Q s 
dzdil dm 



(13) 



where d 2 V / dzdQ, is the comoving volume per unit redshift interval and per unit sold angle, d 2 V/dzdQ. = \ 2 f° r a Hat universe; dn/dM 
is the ensemble-averaged mass function of halos in the mass range [M, Al + dM]; b(M) is the halo bias parameter; 5 m (Q a ) is the mass 
density fluctuation averaged within the survey volume in the redshift slice which has area f2 s and the redshift width Az. The lensing power 
spectrum amplit udes at small angular scales are sensitive to the number of massive halos in the survey region, as predicted in the halo model 
picture 1 Takada & Jai n 2003a b). Hence the lensing power spectrum amplitude becomes correlated with the large-scale density fluctuations 
via the number fluctuations of massive halos. The ensemble average of the power spectrum is not affected by the large-scale mode due to 
the fact ^<5 m (O s )) = 0. However, the correlation adds statistical scatters to the power spectrum measurement and therefore contribute to the 
power spectrum covariance, which is the HSV. At the limit U, lj > 1 we are interested in, the HSV contribution is given as 
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CoV H gv(ii, lj 



" ( d 2 v\ 

X \ dQ,d x ) 



dM^b{M)\k u \ 2 



dM> v 'I 1,1 



D ^-p^(k)\w(k x e s )\ 2 



(14) 



where W(l) is the Fourier transform of the survey window function; P^ is the linear mass power spectrum ; Ki(x) is the Fourier transform 
of the convergence fi eld for which we assu me a Navarro-Frenk-White (NFW) halo iNavarro et al. 1997 ) (see Eq. 28 in Oguri & Ta kada 
201 1 or Section 3.2 in iTakada & JairiE otBb for the expression of Ki). Note that, for notational simplicity, we omitted to denote the redshift 
dependence of dn/dM, b(M) and ki. In this paper, we simply consider the window function given by W(x) — 2Ji(x)/x, which corresponds 
to a circle-shaped survey geometry whose radius is S . F or the halo mo del ingredients (dn/dM, b(M) and the NFW profile), we will 
throughout this paper employ the same models as used in Takada & JairJ 1 2009r) . Roughly speaking, the HSV term can be expressed as 



Covf 



~ Pih(h) Pih(lj)o'^ n (Q B ) , where P\h is the 1-halo term of the lensing power spectrum and a m (Q B ) is the rms of the projected 
linear mass density fluctuations smoothed with the angular scale of the survey area. As implied from the above equation, the HSV affects 
the band powers of different multipoles in the same way, and does not change the shape of the power spectrum. This HSV term cannot be 
realized as long as the discrete Fourier decomposition is used for deriving the power spectrum covariance, because the large-scale modes 
outside the survey region cannot be described by Fourier modes confined inside the survey region. 

Another important feature is the HSV contribution depends on the survey area via the integration of the linear mass power spectrum, 
J kdkP^ik) \W{kx®s) | unlike the other terms that scale as l/fi s . For a power-law linear power spectrum, P^(k) oc k n , the HSV term 



is found to scale as Cov HSV oc 1/(S1 S ) 



l + n/2 



Hence, for n < 0, which is indeed the case at k ^ k e „ (k a „ is the wavenumber of the 



matter-radiation equality), the HSV amplitude more slowly decreases with increasing the survey area than the other terms do. On the other 
hand, when n > or k ^ fc cq , the HSV term more quickly decreases with increasing the survey area. Thus the HSV term depends on the 
surv ey area in a non-tr ivial way, and we will study the relative importance of the HSV term for different survey areas. 

Sato et al] i2009h showed that the HSV term gives a dominant contribution to the power spectrum covariance at large multipoles, 



I > 10 3 , over the first and second terms in Eq. il lb . They showed that, once the HSV contribution is included, the model prediction given by 
Eq. dl lb gives remarkably good agreement with the ray-tracing simulations for different multipole bins and a wide range of source redshifts. 

There are even other sources of non-Gaussian errors arising from a correlation of the lensing field in the weakly nonlinear regime with 
the mass density fluctuations of scales comparable with or larger than the survey area. This contribution can be formulated based on the p ertur- 
bation theory of structure formation, which is valid for the mass density field in the weakly nonlinear regime. iRimes & Hamilton (2005J first 
studied this effect for the 3D mass power s pectrum, and n amed this new contrib ution the beat-coupling mode (also see Hamilton et al]| 20061 ; 
Sefusatti et alJl200^ : Takahashi et al. 2009). Furthermore, de Putter et al. i 2012n recently pointed out that the large-scale density fluctuations 
cause an apparent modulation in the mean density estimated from a finite survey region, and cause even additional negative contribution to 
the covariance. This term was shown to have a similar amplitude to the beat-couplin g mode. Th ese large-scale mode contribution, which is 
relevant for the weakly nonlinear regime, differs from the HSV effect, and Takada & Jainl 1 2009) showed that the contribution to the lensing 
power spectrum covariance is negligible compared to the non-Gaussian errors arising from the 4-point function (the second term in Eq. ll lb at 
multipoles of / > a few 10 2 . Hence, in this paper, we ignore the non-Gaussian errors arising from the mode-coupling in the weakly nonlinear 
regime for simplicity. 



3.2 Bispectrum covariance 

As we discussed above, the lensing bispectrum is given as a function of triangle c onfigurations. An estima tor of the lensing bispectrum from 
the finite-area lensing survey can be found, by extending the method developed in lTakada & Bridld ( 12007!) (see Appendix|A]for the details), 
as 

&ih ' h ' h) = V B N t Jh,l 2 ,l 3 ) E k 1^ A Q^ (h, fa, 'a), (15) 

where q 123 = q 1 + q 2 + q 3 and the summation runs over all the pixels of q 1 , q 2 and q 3 . The function Aq l2i denotes the selection function 
defined so that it gives unity if each vector has a target length such as U — AU /2 ^ qi ^ U + Ah/2 (i = 1,2,3) and the three vectors form 
the triangle configuration in Fourier space, q 123 = 0; otherwise Aq 123 (h, I2, h) = 0. The quantity A^rip is the number of the triplets in 
Fourier space that form a given triangle configuration specified by three side lengths h,h, h within the bin widths. This is calculated in terms 
of the selection function as A'trip = ^2q ^<7i 2 3 1 ' ' 2 > ' 3 )- P re f actor is from our definition of the discrete Fourier decomposition 
(see lTakada & Bridlell2007h. 



As derived in Joach imi et all (2009) and we derive in Appendix[A] for the limit of large multipole bins, h, I2, h~S>lf, we can analyti- 
cally estimate the number of independent combinations of triplets (q x , q 2 , <j 3 ) satisfying a given triangle configuration (where we meant by 
"independent" the triplets are discriminated by the fundamental mode of a given survey area): 



v , , , V* a njhhhAhAhAh 

iVtripUi, (2, 43) = > Aq ~ , (16) 

^ 9123 2^^ + 2^1 + 2^1-^-^-/1 
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(i) Gaussian term: 0(PPP) (ii) 0(55) 




(iii) 0(PT) (iv) 0(P 6 ) 

Figure 1. Illustration of the different terms of the bispectrum covariance. The triangle configurations for two bispectra in the covariance matrix, B(li, I2, £3) 
and B(l' lt l' 2 , tg), are specified by sets of the three vectors (Zi, 1%, Z3) or (l' lt 1' 2 , l' 3 ), denoted by the solid and dashed lines, respectively. The three vectors 
satisfy the triangle conditions Zi + I2 + I3 =0 and Zj + 1' 2 + l' } = 0. (i) The Gaussian part of the bispectrum covariance, which arises only if the two 
triangle configurations have the same shape (within the coarseness of the bin widths). Hence the Gaussian term amplitude is proportional to P(h)P(l2)P(l:i) 
and contributes to the diagonal terms of the covariance matrix. The two vectors with the same length such as Zi and Z' x and in the opposite directions yield the 
power spectrum after the ensemble average, (ii) One non-Gaussian part of the bispectrum covariance which arises if one side of the two triangles is in the same 
length and in the same direction; here l\ = Z' x is shown as an example. The amplitude of this covariance term is proportional to -B(Zi, Z2, ?3)i3(i' 1 , 1' 2 , (3). (iii) 
A second non-Gaussian part that arises if one side length of the two triangles is the same but in the opposite direction, Zi = — Ii. The amplitude is O(PT); 
in particular, it is given by P(Zi)T(Zi, I3, 1' 2 , 1' 3 ) in this case. The surrounding four vectors (i2 : ^3 > ^2 > ^3 m m ' s case ) form a quadrangular configuration 
satisfying the condition l 2 + Z3 + 1' 2 + Z 3 = 0, which gives the trispectrum contribution, (iv) A third non-Gaussian part which arises for generic triangle 
configurations and therefore contributes to all diagonal and off-diagonal terms of the covariance matrix. The amplitude is proportional to the connected part 
of the 6-point correlation function, Pq. As indicated, the 6 vectors of (Zi, Z2, 13, Z 2 , Z 3 ) arise from the two triangles that form the 6-point configuration in 
Fourier space, although the two vertices of the 6 points are collapsed to one point due to the triangle conditions. 



where AU is the bin width of the i-th side length. For the small-angle scales U 2> 1 (flat-sky approximation limit) we are interested in, the 
equation above gives a good ap proximation to theWigner- 3j symbols that appears in the bispectrum covariance derived under the full-sky 
approach (see around Eq. 16 in Takada & Jainl J2004l) ). 

The bispectrum covariance can be similarly defined as 

Cov[B(li,l2,ls),B(li,l2,i3)] = {B{hMM)HliAA)) - B{h,h,h)B{A,l' 2i lz) 



B{h,h,h)B{l' u h,l' 3 ). (17) 



Thus the bispectrum covariance arises from the 6-point correlation function of the lensing field, and is given as a function of the two different 
triangle configurations for the two bispectrum, B(li) and B^), respectively. 

We present the detailed derivation of the bispectrum covariance in AppendixfA]based on the discrete Fourier decomposition formulation. 
Here we just give the expression of the bispectrum covariance, which has three contributions; the Gaussian and non-Gaussian errors and the 
HSV contribution, respectively: 



Co* [fl(J x ,J 2 , 1 8 ), B(Ji, 



C0VGauss + C0V NG + C0V HSV 
A r trip(^l, 12, '3) ^ ^ ^ ^ ^ 



crK cK cK 1 ril r-fi rfi , £^ i n 

^I'^hi'^hi's + thi'^hi'Jhit, + Shi'Jhi'^hi's +3 perms 



: K s-K s-K 



K s-K S K 



+^-B(h,l 2 ,l 3 )B(l' 1 ,l' 2 ,l' 3 ) 



+ 



if- 



+ 7 perms. 



271 



hl in a hAh 



P{h)T{l 2 M,l'2A) + C 



2tt 



2 n s iiAii 



P(/i)T(l2,l3,li,I 3 ) + 7perms. 



2n 



(18) 



where the notation "NG" stands for the "non-Gaussian" error contribution, and Pa denotes the connected part of the 6-point correlation 
function. Fig. Q] shows a diagram picture of these covariance terms from the first to the fourth lines on the r.h.s. of Eq. d!8t . When further 
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Figure 2. Left panel: The plot shows how halos of different masses contribute to the halo sample variance (HSV) terms of the bispectrum covariance for a 
given equilateral triangle configuration. We used the halo model expression (Eq. ll9t to compute the fractional contributions. For the triangle with side length 
/ = 1000, halos with masses > 10 14 A/q dominate the contribution to the HSV. Middle: Similar plot, but for the redshift distribution of the bispectrum 
covariance. For the triangle with I = 1000, halos at z < 0.4 give the dominant contribution. Right: The survey area dependence of the HSV contribution for 
the power spectrum and the bispectrum, relative to that of the Gaussian error, which scales as CovQauss <x 1 / r2 s . Here we consider the scale of I = 2000 for 
P(l) and B ctl (l), employ the flat sky approximation, and assume the survey geometry given by Q s = 7X0^. 



including the intrinsic shape noise contribution, we just replace the power spectra in the terms of the above equation (0(P 3 ) and O(PT) 
terms) with the power spectrum including the shot noise contribution (Eq.llOt, 

The terms of the first line on the r.h.s. are the Gaussian covariance terms, which contribute only to the diagonal terms of the bispectrum 
covariances. The combination of the Kronecker deltas <5, x ,, 5?,, <5, is non-vanishing only if the two triangle configurations are in the 

1 1 2 2 3 3 

same "shape" within the bin widths. In particular, if triangle configurations have symmetry such as isosceles or equilateral triangles, the 
combination of Kronecker deltas (the terms in the square bracket on the first line) yield a factor of 2 or 6 for isosceles and equilateral triangles, 
respectively. The factors account for the fact that different triangles transformed by parity and permutation transformations (U <h> are not 
independent for a statistically homogeneous and isotropic field. For a general triangle configuration U 7^ lj , the factor becomes unity. The 
prefactor Nt x ip{li, ^2,^3) in the first term is given by Eq. l !16t . 

The terms in the second- and to the fourth lines on the r.h.s. of Eq. j 1 8t are the non-Gaussian error contributions, which arise from the 
higher-order correlation functions of the lensing field. The coefficient of each term such as 2n/(Q. s h AZi) is from the number of independent 
configurations in Fourier space that form a given configuration of 6 wavevectors (h, h, I3, l' ly £3) in Fig.Q](also see Appendix lAl for the 
mathematical derivation). The angular integration in the term including Pe is over the angle ip between the vectors l± and l[ in order to 
include contributions over all the possible 6-point configurations in Fourier space in Fig. Q] To be more precise, for a given element of the 
bispectrum covariance matrix, each of the two triangle configurations for the two bispectra is fixed, and the 6-point configuration can be 
parametrized by the angle tp. Note that all the terms in the first, second and third lines on the r.h.s. depend on the multipole bin widths such 
as Aii, while the term including Pq and the HSV do not depend on the bin widths. 

As in the power spectrum covariance (Eq. 1 1 It. the number fluctuations of massive halos due to the large-scale mass fluctuations affect 
the bispectrum estimated from a finite area survey. Similarly we can find that the HSV contribution to the bispectrum covariance to be given 
as 



Cov[B K («l,«2,«3),B«(Zi J «2 I «3)]HSV = 



dM—b{M)k M {h)KM(l2)K M {h) 



dM 'j^FJ b ( M ') k M'(l'l)RM'(l2)^M'(l'3) 



^pt(k; X )\W(k X e s )\ 2 
In 1 1 



(19) 



The HSV contribution to the bispectrum covariance, more generally the higher-order covariance, has not been realized in previous works. By 
comparing with ray-tracing simulations, we will show below that the HSV contribution is dominant over other covariance terms at I > 1000, 
in the nonlinear regime, and adding the HSV term to the model predictions significantly improves agreement with the simulation results. 
Again notice that the HSV term scales with the survey area in a non-trivial way via the linear mass power spectrum (see the discussion below 
Eq.ll4b. while the other terms scale with survey area as l/f2 a . 

The left and middle panels of Fig. [2] show which halos of mass and redshift range contribute to the HSV term of the bispectrum 
covariance for an equilateral triangle configuration of a given side length. For the triangle with I = 1000, halos with masses > 1O 14 M0 
and at redshift z < 0.4 give a dominant contribution to the HSV effect. These halos are relatively easy to identify from a concentration of 
galaxies on the sky, X-ray or the Sunyaev-Zel'dovich effect. In other word, identifying such massive halos in a survey region and comparing 
the number with the expected number for a fiducial co smological model will help understand the HSV effect for the weak lensing observables 
in the survey region (also see iTakada & Bridlell2007l for the similar discussion). The right panel shows how the HSV terms scale with the 
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survey area, in comparison with that of the Gaussian covariance term for the power spectrum covariance and the bispectrum covariance for 
equilateral triangles. Here we considered the particular multipole bin I = 2000; P(l) and B cq (l), and assumed the flat-sky approximation 
and the survey geometry Q, B = tcO 2 for simplicity. If the curves are constant or independent of Q, B , it means that the HSV term scales as 
Covhsv oc 1/fis as in the other covariance terms. Hence the plot shows that, with increasing Q, s the HSV term more slowly decreases for 
fi s < a few 100 deg 2 or the larger survey area than the other terms do. 



3.3 Cross-covariance between power spectrum and bispectrum 



Since the power spectrum and bispectrum are not independent for the non-Gaussian field, we need to account for the cross-covariance 
between the two observables in order not to double-count the information content. 
Similarly, we can derive the cross-covariance (see Appendix iBt: 



Cov [P cst (l),B cst (h,h, la)] = Cov£g + Cov£# v 



= S, 



47T 



dil; 



l h njiAh P(i)B(l, h, Is) + 2 perms. + — / ^ P 5 (l, -I, h, h, ly, i>) + Cov H f v , (20) 

where P5 is the 5-point correlation function, and tj) is the angle between the vectors I and l\ as in Eq. J 1 8t - The HSV term is similarly given 
as 

2 



Cav[P K {l),B K {h,h,h), 



IISV 



(—)' 

\dxdQJ 



dm^b(M)\k M (l)\ 2 



dn 



dM — b(M')k M , (h)R M > {h)h M > {h) 



kdk 
~2n 



PL{k;x)\W{k X Qs 



(21) 



We will use these equations when computing the total information content for a combined measurement of the lensing power spectrum and 
bispectrum for a given survey. 



3.4 Halo sample variance contribution to the ro-point correlation function measurement 

As we have discussed, the HSV effect contributes to the lensing spectrum covariances (see Eqs. I 1411 19l and [2Tb . All the equations have similar 
forms at the small-angle limit, where the 1-halo term is dominated in the halo model picture. Extending these findings, we can find the HSV 
contribution to the covariance matrix between any n- and n'-point correlation functions in Fourier space: 

2 



Cov[P„(Zl, I2, ■ ■ ■ , In), Pn' (il)'2) • • • j l n ' ) 



dX 



dn 



(—)' 



dM—b(M)K M (h)KM(h) ■ ■ ■ k M (l n ) 



dM—b(M')R u ,(l[)K M ,(l' 2 ) ■ ■ ■ ft M '(C) 



^-P^(k; X )\W(k X Q s )\ 2 .(22) 



Roughly speaking, the amplitude of the HSV term simply scales as Covhsv ~ P„ h (h) P^ (^)°"m(Q s) • Thus any n-point correlation 
functions at small angle scales can be affected by the large-scale mass fluctuations of scales comparable with or outside the survey area. 

We should also emphasize that the HSV contribution affects a measurement(s) of any two- or three-dimensional correlation functions 
from a finite area survey, and can b e very important if one is interested in the small-scale signals which are sensitive to the abundance of halos 
in the finite survey region (e.g., see. Shaw et aljEool i lzhang & Shefhkool a similar discussion on the SZ power spectrum measurement). 



3.5 Halo model predictions for the lensing covariances 



As we have described up to the preceding section, the power spectrum and bispectrum covariance calculations require to compute the 4-, 5- 
and 6-point correlation functions in addition to the power spectrum and bispectrum. For the power spe ct rum and bispectrum, there are some 
efforts developing secu re model predictions in combin a tion with simulations, e . g., Smith et alj ( 120030 ; I Valageas & Nishimichil J201 lah for 
the power spectrum and Scoccimarro & Friemanl 1 19991) ; Valageas & Nishimichil J201 lbh for the bispectrum. The h igher-order functions are 
however, fairly uncertain because there are fewer studies to compare the model predictions with simulations (e.g. lTakada & Jain 20oJ for 
an attempt to compute the kurtosis which is the collapsed 4-point function), partly because the higher-order correlations require a substantial 
amount of computational costs. Instead of pursuing a reliabl e model for t he higher-order correlat i on functions, in this paper we employ the 



halo model approach t o compute the higher-order functions JSeliaklbood : IPeacock & Srnlthlboool ; iMa & Frvi 2000 ; IScoccimarro et al.ll200ll 



Coorav & Shethl 2002 ) in which the correlations of the mass distribution are expressed as two separate contributions: correlation of dark 
matter particles within the same halo, and correlations between particles in different halos. We have found that, up to the 4-point correlation 
functions, the halo model predictions are accurate at 10 — 30% level in the amplitude compared to N-body simulations (in particular for 
lensing fields; iTakada & Jaiij2002L [iooj^fbh ■ Since our purpose of this paper is to assess the importance of non-Gaussian error contributions 
to the bispectrum covariance matrices, we consider that the halo model approach is adequate enough. 
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We know that most of the lensing information come from the small angle scales in the nonlinear clustering regime to which the 1-halo 
term, the correlation arising from the same halo, provid es a dominant con tribution. In addition, the non-Gaussian errors are important only at 
the small scales, as can be explicitly found from Fig. 5 in Sato et aljj200gb . For these reasons, we include only the 1-halo terms to compute the 



non-Gaussian error contributions to the lensing covariances, which significantly simplifies the computation. Although the n-point correlation 
function depends on n wavevectors such as Pn(li, I2, ■ ■ ■ , In), the 1-halo term does not depend on any angle between the vectors, but rather 
depends only on the length of each vector; P^ h (li , I2, . . . , In) = Pn h (h,h, ■ ■ ■ ,ln), reflecting spherical mass distribution around halo in a 
statistical average sense. To be more explicit, assuming the Limber 's approximation, the 1-halo term of the n-point correlation function can 



be computed as follows (see around Eq. 30 in ITakada & Jainll2003ah : 



Xs 



r,) h ih.l2 /„)■= / dx-^^JdM^ j k M {h)ku{h)---hu{ln). (23) 







The 1-halo term can be computed by a two-dimensional numerical integration, as the Fourier transform of the lensing field due to an NFW 
halo, km (I), can be computed analytically for a given halo with mass M and at redshift z. 

As can be found from Eqs. (T]}, GU and (|20j, some of the non-Gaussian terms can be further simplified; e.g., one of the non-Gaussian 
terms in the power spectrum covariance (Eg. II lb can be simplified as 

1 1 '''' 1 T(l, —I, l' , —l') ~ -^-T lh (li, k, lj, lj), (24) 



^J\l\ &i A ^J\l'\e h A ^) ' Q 

where we have assumed that the lensing trispectrum does not largely change within the multipole bin Al around the bins U and lj. Thus the 
above approach allows a faster computation of the power spectrum and bispectrum covariances. 



4 RESULTS: COMPARISON WITH RAY-TRACING SIMULATIONS 



4.1 Ray-tracing simulations 

To study the lensing covariance matrices, we use 1000 realizations of ray-tracing simulations for a ACDM model in ISato et al. I J2009I) . 
Although the simulations were done for various source redshifts ranging from z s = 0.6 to 3, we use the outputs of z s = 1 in this paper. In 
brief, each realization has area of 5 x 5 square degrees (Q, s — 0.0076 steradian) in square shaped geometry. The ACDM model adopted is 
characterized by cosmological parameters: the matter density parameter Q m — 0.238, the baryon density parameter flf, = 0.042, the initial 
spectral index n s — 0.958, the amplitude of the density fluctuations <jg = 0.76 and the Hubble constant of Ho = lOO/i km s _1 Mpc _1 
with h = 0.732. The linea r matter power spectrum used to set the initial conditions of N-body simulations is computed by the public code 
CAMB dLewis et alf 2000). It was shown that the ray-tracing si mulations are reli able to within a 5% accu racy up to multipole I ~ 6000 for 
the power spectrum and up to I ~ 40 00 for the bispect rum. See lSato et alj J2009D and lValageas et al.l (2012J) for more detailfl 

As can be found from Fig. 1 in Satoetal. (2009), the ray-tracing simulations were done in a light cone of area 5x5 square degrees, 
viewed from an observer position (2 = position). The projected mass density fields in intermediate redshift slices were generated from N- 
body simulations whose size include the region bigger than the volume covered by the light cone. Hence the lensing fields have contributions 
from the mass density field of scales outside the ray-tracing simulation area, although, exactly speaking, the modes outside the N-body 
simulation box were not included. Thus the ray-tracing simulations allow us to study the HSV effect on the covariance matrices. 



4.2 Measuring power spectrum, bispectrum and the covariance matrices from simulations 

In each simulation realization, the lensing convergence field, k(0), is given on 2048 x 2048 grids. We used the FFT method to compute 
the Fourier-transformed field, k(l). The fundamental mode of the discrete Fourier decomposition is If = 72(= 27t/5°) and the Nyquist 
frequency is ~ 70000, which is large enough compared to the resolution limit of the N-body simulations. We use multipole bins that are 
logarithmically spaced by A In I — 0.3 (A log I ~ 0.13), which significantly reduces the number of triangle configurations (the number of 
different bispectra) we need to consider, compared to Al = 1 as in the CMB case. We consider 16 multipole bins in total; the 1st bin is in 
the range I = [72, 97.2], and the 16th bin in the range I = [6481.2, 8748.8], so Uin = 72 and Z max = 8748.8. 

The power spectrum of the z-th multipole bin U, P(h), is estimated from each realization by computing an azimuthal average of the 
estimator, K^k_^, where |Z| resides in the target bin. We then averaged the estimated power spectra in 1000 realizations to estimate the 
ensemble-averaged power spectrum (corresponding to the power spectrum for the area of 1000 x 25 = 25000 square degrees). Then we 
computed the s catters among the power spectra of 1000 realizations in order to estimate the covariance matrix of the power spectra (see 
Sato et al.ll2009l . for details). Hence the covariance matrix is for an area of 25 square degrees. 

The bispectrum is given as a function of triangle configuration. We use three side lengths (li, 1%, I3) to parametrize triangle configuration, 
where the triangle conditions are given as \lj — lk\ ^ h ^ lj + Ik- Although the multipole bin used has a logarithmically-spaced bin width, 

9 The simulation data is available at |http : / / www . a .phys . nagoya-u ■ ac ■ jp/ ~masanori/HSC/] 
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Figure 3. Illustration describing how to choose three vectors in Fourier space that satisfy triangle configurations within multipole bin widths. The triangle 
configuration is specified by three side lengths which have central values (Ii,l2,l3) and the widths AlnZ = 0.3. The three vertices chosen are used in 
estimating the bispectrum from ray-tracing simulations by averaging the estimator Re[ky Ky ky ] over the triangles (see text for details). First, the first vector 

i'j is chosen from the annulus (shaded region), which is in the upper half plane and has the radius in the range [h mm . Zi.max] (a bin of ii). Then, the second 
vector l' 2 is chosen from the annulus which has position angle in the range arg(i' 1 ) avg(l' 2 ) n and has radius in the range [l 2 mini ^2,max]- For the given 
pair of l' ± and 1' 2 , the third vector l'^ is determined by the triangle condition 1' 3 = — 1' ± — 1' 2 ; if l' s is in the range 2 m j n < ig ^ 'max, the triplet of (Z^, 1' 2 , 1'g) is 
accepted and otherwise discarded. The angle ip±2 is used for discussion in Appendix lAl 

we impose the triangle conditions on the central values of the multipole bins. In addition we impose the condition h ^ h ^ h so that every 
triangle configuration is counted once. For the 16 multipole bins above, we have 204 triangle configurations in total. 

For a triangle configuration that is specified by the side lengths (Zi, I2J3) (with the bin widths), we can estimate the bispectrum from 
each ray-tracing simulation by averaging the estimator Ke[ky ky ky ] over all the triplets Z 2 , Z3) which satisfy the triangle conditions; 
the length of each vector is in the triangle bin such as Zi, m i n ^ l[ ^ Zi, max . Note that, as long as the triplets of 1' 2 , Z 3 ) in Fourier space 
have the same side lengths h,h, h within the bin widths, all the triangles transformed by parallel translation, rotation and parity transform 
are equivalent to yield the same bispectrum due to rotation, parallel translation and parity invariance for a statistically homogeneous and 
isotropic field. In our simulations, the Fourier-transformed convergence field, k(l), is given on 2048 x 2048 grids in Fourier space, where the 
grids are linearly spaced by the fundamental mode, I / = 2n/Q B — 72. To have an efficient computation over 1000 realizations, we first built 
the table of three vectors (grids), (Z'^Z^Z'j), where each triplet satisfies the triangle conditions — l' k \ ^ l' t ^ |Zj + l' k \) and is assigned 
to one of the triangle configurations binned by three side lengths (Zi, I2, Z3). Then we used the same table of triplets for the 1000 realizations 
to compute the average and scatters of the estimated bispectra as a function of triangle configurations. 

To be more precise, we built the table of three vectors (grids) in the way illustrated in Fig. [3] First, we choose the first vector Z' x from 
one of the grids in the upper half of Fourier space, i.e. ^ arg(Z' 1 ) < 7t, by imposing the condition that the length l[ is in the range of 
the multipole bin, l m i n ^ l[ < Z m ax- Then we survey for the second vector 1' 2 from the region where the length is Zm.in ^ 1' 2 < Z max and 
the position angle satisfies arg(Zi) ^ arg(Z 2 ) < n (more precisely, li ^ 1' 2 if and only if arg(Zi) = arg(Z 2 )). For a given pairs of l[ and 
Z 2 , we choose the third vector l' A from the condition Z 3 = — 1[ — 1' 2 and then accept the triplet of (Z' l5 Z 2 , Z3) (otherwise discard it) if the 
length l' a is in the range satisfies the condition Z m in ^ l'z < ^max- Then we assign each set of three vectors, (Z^, 1' 2 , l' s ), to one of the triangle 
configuration bins labelled by (Zi, I2, h) by sorting l[, 1' 2 and 1' 3 in the ascending order. For any of the sets of three vectors chosen in this 
way, two of the three vectors are in the upper half plane of Fourier space, while the remaining one is in the lower half plane. Hence, we miss 
triangles with configurations for which two vectors are in the lower plane and the other in the upper. However, we can recover these triangles 
by just flipping the signs of all three vectors and obtain the same value of bispectrum, due to the symmetry k^ — k* ^, which comes from the 
real condition of the lensing field. Thus we need not count the latter cases, but just twice the number of actually counted triangles to obtain 
-/Vtrip, the number of independent triplets for a survey area of 25 sq. degrees. For some of the following results, we will use the measured 
iVtrip when computing the Gaussian error contributions to the bispectrum covariance (the first term in Eq. I18t . Although Eq. d 1 6b gives a 
good approximation to iVtrip for the limit of Zi 3> 1, the iVtrip directly estimated from the simulation properly takes into account the effect 
of discrete grids in the lensing map. 

Using the table of three vectors (l[ , 1' 2 , 13) obtained in the process above, we estimate the bispectrum by averaging Re[ky ky ky ] from 
each realization. Note that the value of ky is taken from the field at the grid that has two coordinate components (l' ix , l' iy ), not from the field 
at the grid denoted by the arrow of the vector l[ in Fig. [3] Although we take the real part of ky ky ky for the average, this is not essential, 
because the estimated bispectrum satisfies the real condition to a very good approximation after the average over many triangles. 

The dimensions of the resulting covariance matrices are: 16 x 16, 204 x 204 and 16 x 204 (or 204 x 16) for the power spectrum 
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Figure 4. The lensing bispectrum for equilateral triangle configuration as a function of the side length I, where the multipole bins are logarithmically spaced 
by A In I = 0.3. The data points with error bars are the bispectrum measured from the 1000 ray-tracing simulations with the source redshift z s = 1. Each 
simulation covers an area of25deg 2 .The error bars show the scatters of the 1000 realizations, which therefore correspond to the measurement errors expected 
for the area of 25 deg 2 . The solid curve show the halo model prediction, while the dotted, short-dashed and long-dashed curves show the 1-, 2- and 3-halo 
term contributions to the bispectrum. For the 3-halo term, we used the tree-level perturbation theory prediction. Note that the bispectra for I > 4000 may be 
affected by the resolution limit of the ray-tracing simulations. 



covariance, the bispectrum covariance and the cros s-covariance, respecti vely. The 1000 realizations are okay to compute the 204x204 
elements of the covariance matrix (see Appendix of Taka hashi et alj|2011r) . although a larger number of the realizations are ideally needed 
for a more accurate estimate of the covariance matrix. This is definitely true in a case that more triangle configuration s are considered, e.g. 
as in the case of lensing tomography where different redshift slices are further needed to include ( Takada & JainlEo04 ). Hence an analytical 
approach to compute the covariance matrices is useful. 



4.3 Comparison of the simulation results and the halo model predictions 

Fig- plots an example of the measured bispectrum (points with error bars) and the halo model prediction (lines) for equilateral triangle 
configurations. The halo model fairly well agrees with the simulation results, although it under-estimates the bispectrum amplitude around 
/ ~ a few 100 and ov er-estimates at I > 10 00. An improvement of the halo model accuracy may be available by refining the halo model 
calculation, as done in lValageas et alj ( 120121) . However, in this paper, we do not pursue this possibility, because the primary purpose of this 
paper is to study the bispectrum covariance and the information content of the lensing bispectrum, where the inaccuracy of the halo model 
predictions does not have a significant impact. 

In Figs. f5] — [7] we study the covariance matrices of the lensing bispectra for some representative triangle configurations of the 204 
triangles. Note that, in these results, the two triangle configurations of the covariance have the same length(s) for at least one side length 
(e.g., h = I'x). Hence the covariance terms O(BB) and O(PT) in Fig.[T]are not vanishing for the covariances shown in these figures. First, 
in Fig. [5] we study the diagonal terms of the bispectrum covariance matrix for equilateral triangles, Cov[B cq (/), B cq (l)], as a function of the 
side length I. The points show the simulation results. The jagged feature at small I bins is due to the effect of discrete pixels in the lensing 
maps which causes the number of triangles to be non-smooth in the small / range. The different curves are the halo model predictions for the 
covariance matrix, which are computed based on Eqs. j 1 8b and J 1 9b . The dotted curve shows the Gaussian error contribution that scales with 
P(/) 3 , where we used the number of triangles directly computed from the simulated lensing map in computing the Gaussian term in Eq. d 1 8b - 
The long dot-dashed, short dot-dashed, long dashed and short dashed curves are the non-Gaussian terms in Eq. J 1 8b . For these calculations, 
we used the P(l) and B cq (l) directly estimated from the 1000 ray-tracing simulations, while we used the halo model in ^ 13.51 to compute 
the the higher-order functions, which are needed to compute the non-Gaussian terms. The solid curve is the total power of the covariance 
matrix, which is computed by summing all the different curves of the model predictions. The figure clearly shows that the non-Gaussian 
errors become significant at I > a few 100, and the model predictions are in good agreement with the simulation results, if including the 
HSV term. The HSV term is do minated over the other terms at I > 1000. These findings are similar to the results in the power spectrum 
covariance (see Figs. 5, 6 and 7 in Sato et all 20091) . 
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Figure 5. Diagonal terms of the bispectrum covariance matrix for equilateral triangle configurations as a function of the side length I, as in the previous figure. 
The covariance amplitude is shown in the unit of i 6 Cov[B e q(0i ^eq(')]> since i 3 B e q(0 gives the contribution to the skewness (^ 3 )- The points are the 
simulation results estimated from the scatters of 1000 simulations. The other curves are the halo model predictions, which are computed based on the method 
described in § 13.21 The dotted, long dot-dashed, short dot-dashed, and long dashed curves are the contributions that are proportional to P 3 , B 2 , PT and Ps, 
respectively (also see Fig. [T}. The contributions of 0(P 3 ) are the Gaussian error contributions, and the other terms are the non-Gaussian pieces. For these 
calculations, we used the power spectra P(l) and the bispectra B(l) directly estimated from the simulations. The short dashed curve is the HSV contribution, 
which dominates over other terms at multipole bins, I > 2000. The solid curve is the total contribution, the sum of all the terms. 




Figure 6. Similar to the previous plot, but for different triangle configurations. Left panels: The covariance matrix for isosceles triangle configurations, 
B>(h, h, h), with l\ = 12- The different panels are for different side lengths £1 and I2; within each panel, the covariance matrix is shown as a function of 
£3. Note that the covariance matrix is shown in the multipole range where the triangle conditions \lj — Zj.| ^ i; ^ lj + l k are satisfied, but the condition 
^ I2 ^ ^3 is not imposed. Right panels: The covariance matrix elements between the bispectra of isosceles and equilateral triangles. The Gaussian terms of 
P 3 denoted by the vertical dotted lines appear at a particular value of I3, where the two triangles become the same, i.e. equilateral triangles with l\ = I2 = I3, 
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Figure 7. Cross-covariance between the power spectrum and the bispectrum of equilateral triangles, Cov[P(i), B cq (l)], as a function of /. The covariance 
amplitude is plotted in the units of Z 5 Cov[P, B eq ], because l 2 P and l 3 B contribute to ^k 2 ^ and (k 3 ^), respectively. Note that there is no Gaussian error 
contribution to the cross-covariance, because it arises from the 5-point correlation functions. The different curves are the model predictions that are computed 
based on the method in § 13.31 

Fig.[6]shows the covariance matrices for isosceles triangle configurations (left panel) and the off -diagonal components between different 
triangle configurations, equilateral and isosceles triangles here. For this plot, we did not use the condition h ^ h ^ ^3 for presentation 
purpose, but all the triangles shown here are indeed included in the covariance matrix elements we will use in the following analysis. Even 
for these more general triangle configurations, the model predictions including the HSV effect are in good agreement with the simulations. 
One may find that the covariance amplitudes or some non-Gaussian terms peak at some particular value of I3 in the x-axis. This happens 
when the isosceles triangles become to have higher symmetry, equilateral triangles; the number of independent triangles become smaller for 
higher-symmetry triangles, leading to the greater covariance amplitudes. 

In Fig. [7] we study the cross-covariance between the power spectrum and the bispectrum of equilateral triangles, Cov[P(Z), B cq (l)], as 
a function of the multipole bin I. There is no Gaussian error contribution to the cross-covariance, as it arises from the 5-point functions of the 
lensing field. The figure again shows that the model predictions including the HSV term can well reproduce the simulation results. 

In Fig.[8] we compare the halo model predictions with the simulation results for all the matrix elements of the power spectrum covariance, 
the bispectrum covariance and the cross-covariance, in one figure. To do this, we use the correlation coefficients of the covariance matrices 
defined as 



Cov[Xi,Yj] 



v / Cov[X 1 ,X 1 ]Cov[Y ] ,Y j \ 



(25) 



where X and Y are the power spectrum or the bispectrum, and the subscript i or j in X or Y denote the z-th multipole bin or the i-th triangle 
configuration; Xi — P{U),Xi = B(h), and so on. The diagonal components ru = 1 by definition. If rij = 0, it means no correlation 
between the spectra Xi and Yj, while the the higher values of mean stronger correlations. For illustration purpose, we use the following 
indices of the 204 triangles so that the different triangles are indexed in increasing order of Z3 : 

M^i^h^h) = (M,l)> 

(1,1,2), (1,2,2), (2,2,2), 

(1, 1, 3), (1, 2, 3), (1, 3, 3), (2, 2, 3), (2, 3, 3), (3, 3, 3), 
(1,3,4),.-., 



(1, 16, 16), ■ ■ ■ , (14, 14, 16), (14, 15, 16), (14, 16, 16), (15, 15, 16), (15, 16, 16), (16, 16, 16), 



(26) 



where we have used the 16 logarithmically-spaced multipole bins of I. Note that, for a given 23-bin, the other multipole bins (li, I2) are 
listed in increasing order of h (h ^ I2 for each triangle index). With this triangle index, the higher-index triangle configuration corresponds 
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Figure 8. Correlation coefficients r XY (defined by Eq. [25) for the power spectrum and bispectrum covariance and the cross-covariance between the power 
spectrum and the bispectrum, where X, Y = P or B and the indices i or j denote the multipole bin or the triangle configuration, e.g. Xi = B(li). Note that 
the diagonal elements r xx = f by definition. The upper-right matrix elements show the simulation results, while the lower-left elements show the halo model 
predictions with the HSV effect. The upper-left square-shaped panel (16 X 16 elements) shows rf p j for the power spectrum covariance. The lower-right 
panel (204 X 204) shows the bispectrum covariance matrix r BB . The upper-right or lower-left rectangular-shape panels (16 X 204 or 204 X 16) show the 



cross-covariance matrix r - 



or r*r- . With increasing the multipole bins, the correlation coefficients have larger values and approach ~ 1. 



to the triangles having higher multipoles or larger side lengths. The figure shows that the halo model well reproduces the two-dimensional 
features of the covariance matrices seen from the simulationf°l. Note that, for the bispectrum covariance, most of the off-diagonal terms have 
non-Gaussian contributions from the terms O(Pe) (see Fig.QJ and the HSV effect, while the other terms all vanish, in contrary to Figs.Q] 
and[6] because the the two triangles are in different shapes. The correlation coefficients become greater at higher multipoles, almost rij ~ 1, 
meaning significant correlations between the different spectra (power spectra, bispectra and the cross-covariance). 

For comparison, Fig.[9]shows the results if ignoring the HSV effect in the halo model predictions (the simulation results are the same to 
the previous figure). It is clear that the model predictions do not reproduce the simulation results. 



10 Note that, in Fig. [8] we fully used the halo model to compute the covariance matrix elements including the power spectrum and bispectrum, unlike in 
Figs.|5]|6]and|7] 
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Figure 9. Similar plot to the previous figure, but the halo model predictions without the HSV effect are shown (the lower-left matrix elements). Note that the 
simulation results (the upper-right elements) are the same as in the previous figure. It is evident that the halo model predictions are well below the simulation 
measurements for large multipoles owing to the missing HSV terms. 



4.4 Information content of the lensing bispectrum 

As we have studied, the non-linear structure formation induces non-Gaussian errors in the weak lensing field, provoking significant corre- 
lations between the power spectra of different multipoles and the bispectra of different triangle configurations as well as significant cross- 
correlations between the power spectra and the bispectra. Then a more fundamental, important question arises: how much do the lensing 
bispectra carry additional information to the lensing power spectrum? Can joint measurements of the power spectra and bispectra recover the 
information content of the Gaussian field, which the primordial density field of structure formation should have had as in the CMB case? In 
this section, we address these questions. 

A useful quantity to quantify the impact of the non-Gaussian errors is the expected signal-to-noise ratio (S/N) for a measurement of the 
lensing power spectra an d bispectra for a given survey th at is characterized by its area and shot noise parameters. The S/N is sometimes called 
the information content l Tegmark et al Jfl997h (also see lTakada & Jain 2009, and references therein). For the power spectrum measurement, 



the S/N is defined as 

E p(k)[c p ];-p(h), (27) 

!»,( 3 <imax 

where the summation indices i,j run over multipole bin indices up to a given maximum multipole / max , and [C p ] -1 is the inverse of the 
power spectrum covariance matrix (Eq.QT|for the expression). The S/N is equivalent to a precision of determining the log of the power 
spectrum amplitude parameter, In A, from the power spectrum measurement up to a given maximum multipole Z ma x, assuming that the shape 
of the lensing power spectrum is perfectly known. The S/N is independent of the multipole bin width as long as the bin width is fine enough 
to capture the shape of the lensing power spectrum (on the other hand, recall that the relative strength of the non-Gaussian errors to the 
Gaussian errors in each covariance matrix element depends on the multipole bin width assumed). 

Similarly, the S/N for the bispectrum measurement or the information content about the bispectrum amplitude parameter is defined as 



(I) 



sV s = E B A C X B ^ (28) 



where the summation indices i, j run over triangle configurations, and we include all the triangle configurations whose side lengths are up 
to a given maximum multipole Z max . The (S/N)b gives an estimate on the precision of constraining the amplitude parameter of lensing 
bispectra, assuming that the dependence on triangle configurations is perfectly known. Again, as long as the multipole bin width or the steps 
between different triangle configurations are fine enough to capture the dependence of the lensing bispectra on triangle configurations, the 
(S/N) b is independent of the multipole bin width. 

We also consider the S/N for a combined measurement of the lensing power spectra and bispectra up to a given Z max . In the presence of 
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Figure 10. Left panel: Cumulative signal-to-noise ratios (S/N) for the power spectrum (P), the bispectrum (B) and the joint measurement (P + B) for a 
survey area of 25 deg 2 and source redshift z s = 1. It is shown as a function of the maximum multipole £ max , where the power spectrum and/or bispectrum 
information are included over i m ; n ^ £ ^ 'max (see Egs. l27l|28l and |3TV Note that we set Z mm = 72 and did not include the shape noise contamination here 
- it is shown in the next figure. The circle, triangle and square symbols are the simulation results, computed from the 1000 realizations, for P, B and P + B 
measurements, respectively. The short-dashed, long-dashed and solid curves are the halo model predictions. Adding the bispectrum to the power spectrum 
does increase the S/N amplitude, e.g. by about 50% at £ m ax — 10 3 . For comparison, the thin dotted curve shows the S/N for the power spectrum for 
the Gaussian field, which the primordial density field should have contained. Right panel: The thinner curves are added to the left panel to show the model 
predictions without the HSV contribution. For I > 1000, the HSV contribution lowers the S/N significantly. 



the non-Gaussian errors, the total S/N is not simply a sum of the S/N's of the power spectra and the bispectra due to the cross-covariance, 
because the two spectra are not independent. To study this, we first define the data vector for the joint measurement as 

D = {Px, P 2 , ■ ■ ■ ,P rib ,B 1 ,B 2 , • ■ • , B Itrlang , max } . (29) 

The covariance matrix for the data vector D is given as 

/ f~iP s~tPB 



C PB C B 

where the C PB is the cross-covariance between the power spectrum and the bispectrum. Then, the total S/N for the combined measurement 
is similarly defined as 



(I) 



SA = Y Di [C p+B ] 7. 1 Dj. (31) 

P+B 1 U l 



Fig.[l0]shows the expected S/N's for measurements of the power spectra and the bispectra for a survey area of 25 square degrees (i.e. 
the area of the ray-tracing simulation), as a function of the maximum multipole i max up to which the power spectrum and/or bispectrum 
information are included. Note that the minimum multipole is fixed to l m i n — 72, and we did not include the shot noise contamination to the 
error covariance matrices, so the results solely correspond to the cosmological information contents. The circle, triangle and square symbols 
are the simulation results for the S/N's of the power spectra, the bispectra and the joint measurements, respectively, which are computed 
using the 1000 realizations. The thick short-dashed, long-dashed and solid curves are the halo model predictions. First of all, the lensing 
bispectra add a new information content to the power spectrum measurement. To be more quantitative, adding the bispectrum measurement 
increases the S/N values by about 50% for Z max ~ 10 3 compared to the power spectrum measurement alone, where Z max ~ 10 3 or a few 
10 3 are the target maximum multipoles for the upcoming weak lens surveys. This improvement is equivalent to about 2.3 larger survey area 
for the power spectrum measurement alone; that is, the same data sets can be used to obtain the additional information, if the bispectrum 
measurement is combined with the power spectrum measurement. Secondly, the halo model predictions are in nice agreement with the 
simulation results. Note that the total S/N for the joint measurement (P + B) is close to the linear sum of the S/N values {{S/N)p 
an d (S/N)b), not the su m of their squared values (S/N) 2 , due to the significant cross-covariance between P and B (see Appendix C 



in lTakada & Bridldl2007l . for the similar discussion). If ignoring the cross-covariance, adding the bispectrum measurement does not much 
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Figure 11. Cumulative S/N with the HSV effect, as in the previous figure, but here showing the forecasts for upcoming weak lensing surveys, the Subaru 

Hyper Suprime-Cam (HyperSC) survey and the Dark Energy Survey (DES) in the left and right panels, respectively. These surveys are characterized by the 

survey parameters: survey area (Sl s ), mean source redshift (z s ), and mean (effective) number density of source galaxies (n g ). We assumed Q B = 1500 sq. 

degrees, z s = 1, and n g = 20 arcmin -2 for the HyperSC survey and Sl s = 5000 sq. degrees, z s = 0.7, and n g = 10 arcmin -2 for the DES survey. We set 

cr e = 0.22 for the rms intrinsic ellipticity per component for both the surveys. The upper and lower plots in each panel show the results without and with the 

shot noise contamination. The thin curves in the upper plot of the left panel shows the S/N values obtained by scaling the results for 25 sq. degrees in Fig. 1101 
1/2 

assuming S/N oc SJ S ' . The thin curves in the upper plot of the right panel is similar, but obtained by scaling the HyperSC results in the left panel to the 

1/2 

results for 5000 sq. degrees assuming S/N oc Si^ and the source redshift z 3 = 1. The lower plots in each panel show the results including the shot noise 
contribution to the covariance. Clearly shot noise removes information at high I and flattens the S/N curves above I > 1000. 

improve the S/N value (only by 5% or so). Hence it is important to take into account how the power spectrum and the bispectrum are 
correlated for the underlying true cosmology. 

For comparison, the dashed curve shows the S/N values for the Gaussian case, which is the full information content of the lensing field 
;/the lensing field is a random Gaussian field as in the CMB case or the primordial fluctuation fields. The figure clearly shows that, even if 
the lensing field originates from the primordial Gaussian field, the joint power spectrum and bispectrum measurement does not recover the 
full information content of the Gaussian field after the nonlinear structure formation processes. This implies that the higher-order functions 
beyond the bispectra are also important to recover the information content. Or the Gaussian information content or the initial memory can 
not be fully recovered; some of the information content may be lost after the nonlinear structure formation. However, one caution is needed 
to interpret the results in Fig.[l0] The thin curves in the figure show the halo model predictions if ignoring the HSV effect in the covariance 
matrices. This is the case that there is no fluctuations of scales outside the lensing field of 25 deg 2 ; or the Fourier modes in the region obey 
periodic boundary conditions even if the field is highly non-Gaussian, although this does not hold for our Universe. In this case, the S/N 
values are about 25, 11, 18 for the Gaussian case, the power spectrum (P) and the joint measurement (P + B), respectively, for i max ~ 10 
(more exactly the bin of i max = 1245). Thus, the power spectrum contains about 44% of the Gaussian information, and the joint measurement 
can recover about 72% of the information, almost factor of 2. Thus the small S/N value for P + B in Fig. [Toj compared to the Gaussian 
case, is mostly due to the HSV effect. As discussed in § 13.41 the HSV effect alters the overall amplitude of the power spectrum or bispectrum, 
and preserve their shapes. Hence the HSV effect may give the worst case degradation for the amplitude parameter, but may not cause any 
serious degradation in parameters that are sensitive to the shapes of the lensing spectra than naively thought. Thus a genuine impact of the 
HSV effect on the parameter estimation needs to be further studied and this is our future work. 

In Fig. II Hand Table [7] we show the S/N's expected for the upcoming wide-field weak lensing surveys, the Subaru Hyper Suprime- 
Cam (HyperSC) survey and the Dark Energy Survey (DES), which are characterized by survey area, the mean source redshift and the mean 
number density of source galaxies: fi s = 1500 sq. degrees, z s = 1 and n g — 20 arcmin -2 for the HyperSC survey, while f2 s = 5000 sq. 
degrees, z s — 0.7 and n g = 10 arcmin -2 for the DES, respectively. Here we employed the halo model to compute the S/N's, and assume a 
circular survey geometry for simplicity; the survey area is given as S~l B — 7t0 2 . The figure and table show that the upcoming surveys promise 
to allow a significant detection of the lensing bispectrum; (S/N) ~ 26 or 29 for the HyperSC or DES surveys, respectively, when assuming 
Z max ~ 2000 and including the shot noise effect. This means that the theoretical predictions for the lensing bispectra need to be accurate to 
a prediction of a few % in the fractional errors for the upcoming surveys. It can be also found that the bispectra add the new information, 
increasing the total S/N by about 20 - 30% compared to the power spectrum alone. This is equivalent to a factor of 1.4 - 1.7 larger survey 
area. 
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Expected cumulative S/N's for the upcoming weak lensing surveys 



Survey 




Subaru HyperSC Survey 






DES 




Zmax ( — ) 


1000 


1000 


2000 


2000 


1000 


1000 


2000 


2000 


shot noise (<r E ) 


w/o 


with 


w/o 


with 


w/o 


with 


w/o 


with 


(S/N) P 


53 


48 


74 


57 


75 


56 


116 


63 


(S/N) B 


19 


16 


35 


26 


33 


20 


59 


29 


(S/N) P+B 


78 (48%) 


64 (33%) 


98 (32%) 


72 (26%) 


103 (37%) 


66 (18%) 


145 (25%) 


73 (16%) 



Table 1. Cumulative S/N's of the power spectrum (P), the bispectrum (73) and the joint measurement (P + B), expected for the upcoming weak lensing 
surveys, the Subaru Hyper Suprime-Cam survey (HyperSC) and the Dark Energy Survey (DES), as in Fig. 1111 Here we consider Z ma x ~ 1000 and 2000 for 
the maximum multipole (more exactly, the bins of Z max = 1245 and 2268). The column denoted by "w/o" or "with" in the row "shot noise" gives the S/N 
values with and without the intrinsic shape noise contribution to the covariances. The percentage in the parenthesis shows the improvement of S/N value for 
the joint measurement (P + B) compared to the power spectrum alone (P). 



Fig. QT|also shows that the HSV effect is significant for the upcoming surveys. The thin curves in the upper plots of the left-panel are 
the S/N's values computed by scaling the S/N's values for 25 deg 2 in Fig. [Unassuming assuming S/N <x Ql^ 2 . Since the covariance terms 
besides the HSV term scale as l/fi s , the differences between the thick and thin curves are due to the HSV term which depends on a survey 
area via the shape of the matter power spectrum convolved with the survey window function (see discussion below Eq. [T4l > It can be found 
that the S/N values are smaller than the naive scaling results, meaning that the HSV effect has a more slowly-decreasing function than 1/Q S . 
The upper plot of the right panel shows the similar plot, but for the DES results obtained by scaling the S/N values of the HyperSC results 
in the left panel. The S/N values are smaller than the naive expectation due to the following reasons. The DES has a typical source redshift 
of z s — 0.7 compared to z s = 1 for the HyperSC survey, and therefore is more sensitive to large-scale structures at lower redshifts, which 
are more evolving and are in more nonlinear regime. Thus the S/N values for the DES are more degraded by the non-Gaussian errors. 

The lower panels in Fig.QT]show the results when further including the contribution of shot noise (intrinsic shape noise) to the covariance 
matrices. The shot noise leads to a saturation of the S/N values of the cosmological signals. However, the effects of the non-Gaussian errors 
and the HSV contributions are important around Z max ~ 10 3 , the target maximum multipole for the planned surveys, because th e systematic 
effects such as the h i ghly nonlinear clustering effect and/or the ba ryonic effect become more significant at the higher multipoles dwhitel2004 



Zhan & Knoxll2004lHuterer & Ta kada 2005 UHuterer et al]2 006). 



4.5 Principal component analysis of the lensing covariance matrices 

A principal component analysis (PCA) of the power spectrum and bispectrum covariance matrices is useful to quantify how the different 
power spectra and/or bispectra are correla ted with e ach other and how many independent modes or triangles contribute to most of the 
information contents jTakada & Jaii]|2009h (also see IScoccimarrol feoOO. for the the 3D bispectrum case). Since the covariance matrix is 



symmetric by definition, it can always be decomposed as 

(32) 

a 

where X is the lensing power spectrum or bispectrum, X x is the a-th eigenvalue or principal component, (S x )~ 1 = (S X ) T , S^Sf k = 
S(^j and S is normalized so as to satisfy ^ (S x ) 2 = 1. The matrix S x t is considered as the projection matrix as it describes how the 
power in the i-th multipole bin or triangle configuration is projected onto the a-th eigenmode. Using this representation, the inverse of the 
covariance matrix is given by [C^]^ 1 = S^(l/\ x ) 2 Sf a . Hence, the S/N values (Eqs.l27lorl28t can be rewritten as 

(#£-E{5?E«5*}' '• 

where Xi is either P(h), B (U) or the joint (P + B). Thus, since (l/\ x ) ^\ S X iXi can be considered as the S/N for the a-th eigenmode, 
the above equation expresses the total S/N (for a given Z max ) as a sum of contributions from the independent eigenmodes. We can then 
re-order the eigenmodes such that the lower eigenmode has the higher (S/N) 2 ; e.g., the first eigenmode has the highest (S/N) 2 . 

Fig.Q2]shows the PCA results for the power spectrum measurement for the Subaru HyperSC-type survey including the shot noise effect 
as in Fig. Qj] This fi gure can be compared with Fig. 5 in Taka da & Jaml (2009), where the HSV effect was not included. The short-dashec 



and dotted curves show the differential (S/N) 2 values of each i-th multipole bins with and without the HSV effect, showing how the total 
(S/N) 2 value increases with adding the i-th multipole bin from the (S/N) up to the (i — l)-th bin. It should be stressed that the power 
spectrum of I ~ 10 3 has a local minimum, due to the significant HSV contribution. 
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Figure 12. Principal component analysis (PCA) for the power spectrum covariance for the 16 logarithmically-spaced multipole bins over 72 I 8748.8, 
for the Subaru HyperSC-type survey with 1500 deg 2 and with the shot noise. Left panel: The long-dashed curve shows the cumulative (S/N) 2 value for 
the power spectrum as in the left-lower panel of Fig. I 111 but for (S/N) 2 instead of S/N (the points denote the central value of each multipole bin). The 
short-dashed and dotted curves show the differential contribution to the (S/N) 2 value at each multipole bin, with and without the HSV effect, respectively. 
The solid curve shows how the (S/N) 2 value is recovered by adding the PCA eigenmodes, where the PCA eigenmodes are ranked in decreasing order of the 
differential (S/N) 2 value (see Eq. l33t . i.e. the higher (S/N) 2 for the lower eigenmode. Including up to about 7 eigenmodes, less than half of the 16 multipole 
bins, can recover about 90% of the total (S/N) 2 value. Right panel: The projection matrix \S a i\ for the first 8 eigenmodes, where the index a denotes the 
a-th eigenmode. The projection matrix for each eigenmode peaks at some multipole bin, but has tails to different multipole bins. In particular, the eigenmodes 
around a few 10 3 have longer tails, reflecting significant correlations between different multipole bins due to the non-Gaussian errors. 



The top dashed curve shows the total (S/N) 2 values as a function of the maximum multipole / max , up to / max = 8748.8 (the 16-th 
multipole bin). On the other hand, the solid curve shows how the cumulative (S/N) 2 value increases by adding the a-th eigenmode (Eq. l33j"1 
The figure shows that, among the 16 multipole bins, including about 7 eigenmodes (about half of the multipole bins) can recover about 90% 
of the total (S/N) 2 . The other eigenmodes are relatively less important due to strong correlations between the different multipole bins. The 
right panel shows the projection matrix \Si a \ for each multipole bin, showing how the neighboring multipole bins are correlated with each 
other. In particular, the power spectra around / ~ 10 3 show significant correlations. 

Fig.[T3]shows the PCA analysis for the bispectrum covariance matrix, where we included the 204 triangles over 72 ^ I ^ 8748.8. First, 
it can be found that the different triangle configurations contribute to the total (S/N) 2 in a complex way. We use the triangle index, given 
by Eq. d26t . and the higher-index triangle corresponds to triangle with larger side length of "£3" value, under the condition h ^ h ^ h. 
However, for a given I3 bin, either of the two side lengths h or I2 can be much smaller than I3, which generally has the smaller differential 
(S/N) 2 . As a result, the differential (S/N) 2 curve has jagged features. The sold curve shows the cumulative (S/N) 2 by adding the new 
PCA eigenmodes, and manifests that about 70 eigenmodes, about one third of the 204 triangles, carry about 90% of the total (S/N) 2 . 

Fig.ll4lshows which triangle configurations contribute to the first 6 PCA eigenfunctions, which therefore have the 6 highest differential 
(S/N) 2 values. More exactly, each panel shows the projection matrix \Si a \ for the a-th eigenmode (a — 1,2,..., 6). In most panels, the 
projection matrix \Si a \ has a single or a few peaks, reflecting that the higher- S/N eigenmode arises mostly from a single or a few different 
triangles. For example, the highest (S/N) 2 eigenfunction arises from isosceles triangles, which have side lengths of a few 10 3 (11- and 14-th 
multipole bins). However, note that we have used rather wide, logarithmically-spaced multipole bins, and most of triangle configurations 
available from Fourier space are close to isosceles triangle configurations. Hence, an exact shape of triangles giving most contributions to the 
total (S/N) 2 value slightly changes with the bin width. If we take a finner multipole bin, the triangles with the highest (S/N) 2 would differ 
from isosceles triangles. 



First, we made the principal component analysis of the power spectrum covariance matrix including up to the multipole bin I = 8748.8 (i.e. 16 X 16 
matrix). Then, we re-ordered the eigenmodes in increasing order of the differential (S/N) 2 value. The solid curve in Fig. I12l shows how the cumulative S/N 
value increases by adding a new a-th eigenmode. 
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triangle ID 

Figure 13. PCA analysis for the bispectrum covariance, for the 204 triangle configurations over 72 ^ I ^ 8748.8, for the HyperSC-type survey as in the 
previous figure. The ir-axis denotes either the triangle index or the PCA eigenmodes. The triangle indices are in the order given by Eq. {26); the larger indices 
correspond to triangles with the larger side length of £3. Note that the triangle parameters are (Zi, £21 ^3) under the condition h ^ I2 ^ £3- The vertical 
shaded regions denote the triangles with the same £3 bin, as indicated by the top label of "£3"-bin values. Upper panel: The short-dashed curve shows the 
differential (S/N) 2 of each triangle configuration, while the long dashed curve shows the cumulative (S/N) 2 as in the left-lower panel of Fig. fTTI but for the 
finer multipole bins (stepped by each triangle configuration). The solid curve shows the cumulative (S/N) 2 as one adds the different PCA eigenmodes. About 
70 eigenmodes out of 204 triangles can recover 90% of the total (S/N) 2 of the bispectrum measurement over 72 ^ / ^ 8748.8. Lower panel: The solid curve 
shows the number of independent triangles available from Fourier modes for the HyperSC-type survey (1500 square degrees), for each triangle configuration 
in the x-axis. 

5 CONCLUSION AND DISCUSSION 

We have studied the covariance matrices of the lensing power spectrum and the bispectrum, by using both 1000 ray-tracing simulations and 
the analytical halo model. We have found that there are significant non-Gaussian error contributions to the lensing covariance matrices; the 
power spectra or the bispectra at high multipoles of > a few 10 2 are highly correlated with each other. In particular, we have shown that 
the mass density fluctuation modes of scales comparable with or outside a survey region cause significant non-Gaussian error contributions, 
which we call the halo sample variance (HSV) effect and have not been fully studied in the literature. With the HSV contributions, the halo 
model predictions accurately reproduce the covariance matrices measured from the 1000 simulations (Figs.[4]-[8]l. 

Then we addressed how the lensing bispectrum carries additional information compared to the power spectrum, by including all the 
triangle configurations available up to the maximum multipole Z max . Adding the bispectrum measurement improves the cumulative S/N 
values by about 20-50% compared to the power spectrum measurement alone, at a few 10 3 for l max . We also consider the cases with 
and without shape noise (Figs. [Tolll H and Table [T}. The improvement in S/N is equivalent to about 1.4 - 2.3 larger survey area for the 
power spectrum measurement alone. Hence, our results show that the same imaging data can be used to improve the constraining power 
of cosmological parameters, if we can combine the power spectrum and bispectrum measurements. We also find that the HSV effect is 
significant, leading to a large degradation in the cumulative S/N values. By using a principal component analysis of the covariance matrices, 
we find that about 1/3 eigenmodes of all the triangle configurations over the range of 10 2 < / < 10 4 (70 eigenmodes compared to the 204 
triangles for the multipole binning we assumed) carry most of the total information for the bispectrum measurement (Fig.ll3t. Thus, our 
results give a quantitative answer to long-standing question of how the bispectrum can be useful and complementary to the power spectrum 
while fully taking into account non-Gaussian errors and all triangle configurations. 

Future surveys such as the Subaru HyperSC or DES surveys allow a significant detection of the lensing bispectra (Fig. II Hand Table[T}. 
However, there are some practical issues for making the bispectrum measurement feasible for future surveys. First, we need to take into 
account effects of survey geometry and masking regions such as those due to saturated bright stars. This can be done by extending the 



© 0000 RAS, MNRAS 000, 000-000 



Information content of WL bispectrum 2 1 




1 

0.5 



a=2 



100 200 
triangle ID 



100 
triangle ID 



100 
triangle ID 



200 




200 



(11 14 14) 



(2 13 13) 




(10 14 14) 



0.5 




1 

0.5 




0.5 



a = 5 



100 
triangle ID 



100 
triangle ID 



100 
triangle ID 



200 



200 




200 



(9 14 14) 



(11 15 15) 




(3 11 11) 



Figure 14. The curve in each panel shows which triangle configurations contribute to each PC A eigenmode for the bispectrum measurement in Fig. [13] We 
considered the first 6 PCA eigenmodes: the differential (S/N) 2 values are higher in order of the left-top, left-middle, right-middle and right-bottom panels. 
More exactly, the absolute values of \Si a \ are plotted, where Si a is the projection matrix of the a-fh eigenfunction onto the i-th triangle configuration. The 
triangle configurations shown on the right side of each panel are the triangle configurations with Si a ^ 0, where the triangle sizes are plotted on the linear 
scale and the thickness of the lines is proportional to the amplitude of \Si a \. The configurations with the largest contribution are labelled in the side length bins 
Clj '3) at top. Note that we are using logarithmically-spacing multipole bins in this paper, so about 3/4 triangles available in the Fourier space are isosceles 
triangle configurations. 



method in Hika ge et alj J201 11) for the power spectrum measurement to the bispectrum case, but has not yet been fully addressed in the 
literature. It is also important to explore a method of cleanly decomposing three-point correlations of E/B modes in the presence of the 
survey window function. An alternative approach is to measure the real-space three-point correlation functions, rather than the bispectrum, 
which does not suffer from the survey effects. However, in this case, there are even more significant correlations between the three-point 
correlations of different triangles, and an estimation of the covariance matrices involve multi-dimensional integrations of the bispectrum 
covariance, if one wants to use the halo model approach. Or a sufficiently large number of simulations may be used, but again it is not 
feasible to construct the covariance matrices for different cosmological models, which will be needed for parameter estimation. Hence, 
whether Fourier- or real-space is more useful/tractable for the three-point correlation measurements is still an open issue. 

Although we focused on the cumulative S/N as a measure of the information content, S/N we defined is just one measure to quantify the 
complementarity of the lensing bispectrum. Specifically, the S/N quantifies the expected precision for determining the amplitude parameter, 
assuming that the shapes of the bispectrum or the power spectrum are perfectly known. Hence, the S/N is not necessarily a good measure 
to quantify the powe r of the bispectra for constraining cosmological parameters that are sensitive to the shape of the bispectrum (also see 
Takada & JaiiJ2009L for the similar discuss i on). H ere we would like to stress a possible good news. Our results for the cumulative (S/N) can 



be compared with Fig. 5 in Takada & Jainl 1 2004 ), where only the Gaussian error contributions to the bispectrum covariance were included. 
The relative (S/N) 2 values between the power spectrum and bispectrum measurements are similar to the results shown in this paper, because 
the HSV effect degrades both the power spectra an d the bispectra simila rly. Even if the bispectrum S/N is smaller than that of the power 
spectrum by a factor of a few for i max ~ a few 10 3 , Takada & Jainl 1 2004 ) showed that adding the bispectrum measurement can significantly 
improve accuracies of parameter estimation from the power spectrum measurement alone, by efficiently breaking parameter degeneracies. 
Thus we can expect similar improvement even with the inclusion of non-Gaussian errors. Exploring the genuine usefulness of the lensing 
bispectrum for cosmology is very interesting - the full forecasts for various cosmological parameters will be presented in future work. 

In particular, to do such parameter forecasts, it would be much more interesting and useful if including tomographic information of the 
lensing field that is available from photometric redshift information of every source galaxies. It has been found that t he lensing tomograph y 
can significantly improve the constraining power by recovering the radial-direction information of the lensing field fe.g. lTakada & J ain 2004). 
However, for the lensing bispectrum tomography, we need to further include all the triangle configurations in multipole and redshift space in 
order to estimate the full potential. For example, if we consider three redshift bin tomography, the total number of triangles is 3 3 x 204 = 5508 
for the same multipole binning as used in this paper. Thus, even 1000 realizations are not enough to reliably estimate the covariance matrices 
of the lensing bispectra. Hence we believe that the analytical formula we developed in this paper are essential to address these issues. 

Perhaps as important as the improvement in statistical precision, the bispectrum enables self-calibration of systematic errors inherent 
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in the lensing tomography measurements such as photometric redshift errors and imperfect galaxy shape measurement. Lensing bispectra 
depend on systematic errors in a different way from the power spectrum, but the two spectra share the same large-scale structure, hence the 
same cosmoiogicai parameters that describe the underlying true cosmology. Thus combini ng the power spectru m and bispectrum measure- 
ments can be used to self-calibrate systematic errors and improve cosmoiogicai constraints fauterer et al.ll20o3) . Again, by fully taking into 
account the non-Gaussian errors, one needs to quantify how well self-calibration works for upcoming lensing surveys 

Planned multi-colour imaging surveys can al so be used to measure o ther cosmoiogicai probes such as the abundance of galaxy clusters 
and the correlation functions of galaxies. Recently, Oguri & Takadal 1 201 lb proposed a new method of using the halo-shear correlations or the 
stacked lensing signals around massive halos as a function of the cluster redshifts, and showed that the stacked lensing tomography allows to 
constrain cosmoiogicai measurements to a high precision. In halo (cluster)-shear correlations, the signals arise from the large-scale structure 
at a particular redshift, the cluster redshift, which gives us a measure of the three-dimensional power spectrum, rather than the projected, 
angular power spectrum. As we have shown, massive clusters are key observables to understanding the HSV effect on the lensing power 
spectra, and it would be very interesting to pursue how we can combine the lensing power spectrum and bispectrum measurements with 
the cluster observables to improve cosmology parameter estimation by calibrating the HSV effect. The formulation we developed in this 
paper can be straightforwardly extended to the halo-shear bispectra such as the halo-shear-shear and halo-halo-shear three-point correlation 
functions. 

Finally, there are several effects that we ignored in this paper and need to be studied. First, we only considered a simple circular-shaped 
survey geometry to compute the HSV effect. As we showed (see § 13.4b . the HSV depends on the survey window function, which in turn 
determines which large-scale density fluctuation modes, outside the survey region, contribute to the error covariances. For a general-shape 
survey geometry, the Fourier-transformed survey window function becomes anisotropic, and causes additional apparent correlations between 
the different spectra, in a two-dimensional way of Fourier space. Hence, numerical simulations would be needed to study this effect. Since 
the large-scale density fluctuations, which contribute to the HSV effect, are at very large-length scales and therefore in the linear regime, it 
may be straightforward to take into account the survey window effects, by using our formulation. Secondly, in this paper, we used the flat-sky 
approach, which is not valid for a very wide solid-angle survey such as LSST For a full-sky survey as the ultimate case, the HSV effect 
is caused by the horizon-scale perturbations, outside the survey region, and the fluctuation modes are at wavenumbers comparable to the 
matter-radiation equality wavenumber fc oq . In this case, the HSV effect decreases more rapidly with the survey volume than other covariance 
terms (oc Q s ) as we discussed below Eq. d 14b . Hence the HSV effect may not be that significant for such an all-sky survey. This is worth 
studying, and the formulation and method we developed in this paper can be used for such a study. 
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APPENDIX A: DERIVATION OF THE BISPECTRUM COVARIANCE 



We derive equations of the bispectrum covariance matrix given in terms of t he lensing spectra (power spectrum, bispectrum and the higher- 
order functions), following the method developed in Takada & Bridle ( 2007). 



Al Discrete Fourier decomposition 

The lensing power spectra need to be estimated from the Fourier-transform of the lensing convergence field, . The Fourier decomposition 
has to be done for modes taken from a finite survey region. For such a finite sky measurement, an infinite number of the Fourier modes are 
not available. Therefore, the Fourier decomposition is by nature discrete, and the fundamental mode is limited by the size of surveyed area, 
If — 2n/Q s , where the survey area is given by fl B — Q 2 (we assume a square survey geometry for simplicity). For this case, the convergence 
field can be expanded using the discrete Fourier decomposition as 

= 5>« e<i * (A1) 
i 

where the summation runs over the combination of integers (n x ,n y ) for I = (2n/Q s )(n x ,n y ). The prefactor l/f2 a is our convention, 
motivated by the fact that the discrete Fourier decomposition has the limit of the continuous Fourier decomposition; (l/f2 s ) reie 1 — > 
fd 2 l/(2n) 2 kie^ for the limit of O s — > oo. Thus, in the discrete Fourier decomposition, the lensing field is expanded by the Fourier 
modes confined within the finite survey region. In other words, this treatment cannot include the modes of scales outside the survey region. 
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For this reason, the discrete Fourier decomposition formulation cannot yield the HSV effect that arises from the density fluctuations of scales 
outside the survey region. 

In the discrete Fourier decomposition, the orthogonal relation for eigen function e is modified as 

' 2 0e l(l ~ lye = a<5f_ r , (A2) 

where the integration range is confined to the survey region, and 8^ is the Kronecker-type delta; 5^ = 1 if I = 0, and otherwise 5^ — 0. 
The orthogonal relation above suggests that the Kronecker delta should be replaced with the Dirac delta function for the limit of O s — s- oo: 

Hence the definitions of the lensing spectra are modified for a finite-area survey from Eq. Q; e.g. 



{ k l k l 2 ) =^ 5 f 1+ l 2 P ^) 

( k h k l 2 k h) = ^ 5 l 1+ l 2+ l 3 B (h,h,h), (A3) 

and similar expressions for the higher-order functions. These expressions have the limits; (k-^^k^^ = Q. s 8^ + ^ P(h) ~> '/ ~ 

(2n) 2 5 D (h + l2)P(h) and so on for an infinite-area limit (fi s — > oo). 



A2 Bispectrum estimator and the covariance matrix 

The bispectrum is given as a function of triangle configurations. We use a parametrization of three side lengths (Zi, Z 2 , h) to specify a triangle 
configuration. The three parameters are enough, because all the triangles transformed by parallel transformation, rotation, permutation and 
parity transformations in Fourier space are equivalent to yielding the same bispectrum in an ensemble average sense for a homogeneous and 
isotropic field as expected for the lensing field. Hence, within the framework of the discrete Fourier decomposition we described above, we 
can define an estimator of the bispectrum: 

6(fu ' 2 ' h) = n,N tIiP lh,i2,h) £ hq * hq * A<7 - {h > h > ' 3) ' (A4) 

where we have introduced the abbreviated notation q 123 = q 1 + q 2 + q 3 and the summation runs over all the grids of q 1 , q 2 and q 3 (3 
dimensional summation) in Fourier space. The function Aq (Zi, Z2, Is) denotes the selection function defined so that Aq l23 = 1 if each 
vector has the target length, h — Ah/2 ^ qi ^ h + Ah/2 (i — 1, 2, 3) as well as the three vectors satisfy the triangle condition, q 123 = 0; 
otherwise Aq = 0. The quantity A^trip is the number of triplets of grids (vectors) (see Fig.O which form the triangle configuration of 
(h, li-, h)- 

As implied from Eq. dA4b and Fig. [3] we need to average the estimator of kq 1 Rq 2 kq 3 over all the triangles that have the side lengths 
of (Zi, h, h) within the bin widths. For the limits of h S> If, the number of independent triplets of a target triangle configuration, Atrip, can 
be estimated as 

N tlip (h,l 2 ,h) = ^Aq 123 (Z 1 ,Z 2 ,Z 3 ), 
<?> 

" 2X {2^/e^ = — 3 hhAhAl 2 A vl2 , (A5) 

where we meant by "independent triplets" different combinations of triplets that form the target triangle configuration, and AZi is the bin 
width around each multipole bin, <^i 2 is the angle bounded by the two side lengths qi and q 2 in Fig. [3] and Aip-12 is the angle extent allowed 
by the multipole bins as we explain below. First, 27rZiAZi/(27t/B s ) 2 is the number of grids in the annulus, which has the radius of Zi with 
width AZi, as in the coefficient in the power spectrum covariance (see Eq.ll2t. Then, for the first grid of the vector q 1 , we can find other two 
vertices (two grids), denoted by the vectors q 2 and q 3 in Fig. [3] so that the two vectors have the target lengths, g 2 = Z 2 and 53 = Z3 within 
the bin widths. Given the bin widths, the number of grids, which are covered by variations in the two vectors q 2 and q 3 , can be estimated as 
(Z 2 A(pi 2 x AZ 2 )/ (27t/0 s ) 2 , where A</? l2 is the variation in the angle ip\ 2 allowed by the variations in the two vectors q x and q 2 . Finally, 
the prefactor 2 in Eq. dA5t accounts for counter-part triplet of grids that are transformed by parity transformation </? l2 — 5* — ip\ 2 . 
As can be found from Figure[3] the angle <pi 2 is given as 
2 , 2 2 

cosv3i 2 = 7: , (A6) 

2qiq 2 

where the two vectors have the target lengths within the bin widths: h — Ah/2 ^ qi ^ h + Ah/2 (i = 1, 2, 3). If varying the side length 
53 by AZ3, we can find the angle variation as 

. . .-r ZgAZg 2Z3AZ3 

A<p 12 ~ (sm<y3i 2 ) — — = — ^=. (A7) 

hh ^2Z^+2Z?Z§ + 2Z^-Zf-Z4-Zf 
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Therefore, substituting this equation into Eq. iA5l leads us to find the number of independent triplets that form the triangle configuration of 
(h, h, l 3 )' 

Wli,fe,*0 - g 2 a (A8) 

71 ^1 ^2 ^3 

The triplet number has a symmetric property under permutation of h o Z2 and so on. As shown in Joachimi et al, 1 20091) . this expression 
gives a good approximation, better than 1% accuracy, to the full-sky expression that is given by the Wigner-3j symbol, for the limits of 
U » 1 (i.e. the fiat-sky limit). 

The ensemble average of the bispectrum estimator dA4t is found to yield the lensing bispectrum: 

■ B{h, fa, Ia)a«f ia8 A 9l23 (h,h, h) 



£l s N ttlp (h,h,l 3 ) ^ 



l 2j l 3 ) J] 4 123 Aq 123 (Zr, i a , i a ) 



iVtrip 

= B(l u l 2 ,l 3 ), (A9) 

where we have assumed on the third line that the lensing bispectrum does not largely change within the multipole bins AZj, and on the forth 
line that the Rronecker delta Sq automatically holds together with the selection function Aq l23 ; i.e., 5q 12 = 1 when Aq 123 = 1. 
The bispectrum covariance is defined as 

Cov [B(l u h, l 3 ), BQ' U l' 2l 1' 3 )] = n „ R vMM'*q'M0^™ V 123 ~ hmi'Mz), (A10) 

top S trip g g , 

where we omitted the arguments in -/V tr i P and Aq such as iV^-ip = ^tri P (^i, ^2, 1' 3 ) f° r notational simplicity. Thus the bispectrum 
covariance arises from the 6-point correlation function. 

The 6-point correlation function in can be further computed as 

(Kq^q^Rq^hq^Rq^Rq^) = ^tP(qi)P(q3)P(q 2 )Sq 12 Sq 31 ,5q 2 , 3 , + 14 perms. 

+QlB(q 1 ,q 2 ,q 3 )B(q' 1 , q' 2 , q' 3 )8q l23 8q l/2 , 3 , + 9 perms. 

+n^P(q 1 )T(q 3 , q[, q' 2 , q' 3 )Sq i2 Sq 3il2r3 , + 14 perms. 

+n s T 6 (q 1 ,q 2 , q 3 , q[,q 2 , q' 3 )8q 1231 , 2 , 3 , , (AH) 
where 5q 1231 , = 8q + q + q + q> and so on. Inserting the 6-point correlation function above into Eq. dAlOt yields 

Cov [B(h,h, l 3 ), B(l[M)] = jJ—JFT- (h,h,l 3 )A q/i Jl' u l' 2 ,l' 3 ) 

trip trip g g , 



X {n s P( qi )P(q 2 )P(q 3 ) [5q l Jq 22 ,8^ 33 ,+ 8^,5^,5^, + 5^,5^,5^, + 3 perms.] 
+B(q 1 ,q 2 ,q' 1 )B{q 3 , q' 2 ,q' 3 )&q 12i ,5q 32 , 3 , + 8 perms. 
+P{qi)T(q 2 , q 3 ,q 2 , q's) 3 ^, 5 q 232 , 3 , + 8 perms. 

+^fT 6 (q 1 ,q 2 ,q 3 ,q' 1 ,q 2 ,q' 3 )8q i23i , 2 , 3 ,^ , (A12) 

where we have dropped terms including the Rronecker delta such as Sq i2 , because these conflict with the triangle conditions q 123 = or 
9i'2'3' = (these imply q 3 = if q l2 = 0). As we explained in Fig.Q] the bispectrum covariance has different contributions that arise 
from the terms proportional to P 3 , O(BB), O(PT) and Pq, respectively. We below show further simplified expressions of each term in the 
bispectrum covariance. 



A2.1 Gaussian error contribution to the bispectrum covariance 

First, we consider the term in Eq. JA12t proportional to the power spectra cubed P :i . Hence we call this term the Gaussian error contribution, 
even though the bispectrum itself is a measure of non-Gaussianity in the lensing field. In fact, many previous works only considered this term 
when studying the lensing bispectrum, e.g. for parameter forecasts, mainly for simplicity. The Gaussian term can be further simplified as 

Cov Gauss = £ J ^^Aq 123 (; 1 ,; 2 ,; 3 )A g , 23 (;!,; 2 ,/ 3 )P(gl)P(g2)P(g3) [8q xx ,4 22 ,S^ 33 , +5perms.] 

t«P trip g 9 , 
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^2A qi23 (h,l2,ls)P(qi)P(q2)P(q 3 ) 



111' lol^ I3I 



3'3 



5 perms. 



Qi 



A/trip 



P(h)P(h)P(l S ) 



s-K ?K rK 

°lil[ hl' 2 l 3 l'. 



+ 5 perms. 



(A 13) 



where we have used the facts that the power spectrum does not largely change within the bin width and that the term including the Kronecker 
delta (the terms in the square bracket) is non-vanishing only if the two triangles of the two bispectra have the same shape. The term including 
th e Kronecker de l ta giv e 1, 2 or 6 for general, isosceles and equilateral triangles, respectively, which corresponds to the factor A in Eq. (28) 
in ' 



i Takada & Jainl 1 2004) . Thus the Gaussian error covariance term contributes only to the diagonal terms of the bispectrum covariance matrix. 
Recalling the fact iVtrip oc l/£l 2 (Eq. lA8t . this contribution scales with survey area as CovGauss oc l/fi s 



A2.2 Non-Gaussian error contributions to the bispectrum covariance 



Other terms in Eq. JA12t are non-Gaussian error contributions to the bispectrum covariance that arise from the higher-order functions. In the 
following, we derive further simplified expressions of each terms. 



A2.2.1 Non-Gaussian terms of 0(B 2 ) We consider the terms in Eq. d A 1 2t that are proportional to the bispectra squared, the terms of 
0(B 2 ). Here we consider the first term of 0{B 2 ) terms in Eq. dA12t as an example as 



~ H B(q 1 ,q 2 ,q' 1 )B{q 3 , q' 2 , q' 3 )5q^, <5q 32 , 3 , Aq 123 00 V 123 (H) 

tnp trip „ „, 

tri P trip „ , 



IV' 

rl P JV trip 



B(h, l 2 , l[)B(l3, l' 2 , 13 



si 



j^-B(h, l 2 , l[)B(l3, l' 2 , 13) 



1 hi' 

tt — 1 B(h,l 2 , h)B(l3, l 2 ,l 3 ) 
A/trip A trip 



5« l , i B(l 1 ,l 2 ,l3)B(l' 1 ,l , 2 ,l'3) 




(I'i) 



n B 2Al' 2 l 2 Aip, 



= s. 



271 



7 B(l 1 ,l 2 ,l3)B(l' 1 ,l' 2 ,l' 3 ). 



(A14) 



In the above calculation, we have used several simplifications using the triangle conditions and the Kronecker deltas. In the second line of 
the r.h.s., the product of Kronecker deltas, Sq + q + q, Sq + q* +q i , is non- vanishing only if I3 = l[, because of the conditions imposed 
by the selection functions Aq 123 (Zi) and Aqr [l'/); i.e. h — Ah/2 sC qi ^ £1 + Ah/2 ^, q 1 + q 2 + q 3 = 0, and so on. In other 
words, the term above is non-vanishing only if the two triangle configurations have the same length on their one side, I3 = l[ in this case 
(see Fig. QJ. Hence, in the second line, we introduced the Kronecker delta 8^ v , and also used the Kronecker delta Sq + q + q, to drop 
the summation over q' x (then we also used the triangle conditions, q' x = — q x — q 2 — <J 3 due to q 1 + q 2 + <J 3 = 0). In the third line, 
we dropped one Kronecker delta Sq + q, + q, because it is automatically satisfied by the selection function Aq^ + qi + q^ . The curly bracket 
is intended to mean that the summation ^2q, q, is done before another summation X^q • ^ n me f° rm li ne ' we computed the summation 
^2q, q, Aq 3+ qi^ + qi^ (h,' 1' 2 , 13), which gives the number of triplets of grids satisfying the selection function. For the limits of q' 2 ,q'3^>lf, 
we can use the similar calculations given by Eqs. dA5t and ( IA8b - With the selection function Aq 3 +q 2 +q 3 ('1/ ^2> '3)- we can find that, 
for a given vector q 3 , the summation gives the number of paired grids (q' 2 , q'3) within the bin widths, l 2 — Al' 2 /2 ^ q' 2 ^ 1' 2 + Al' 2 /2 
and I3 — Al'3/2 ^ q' 3 ^ £ 3 + Al'3/2. According to the similar calculation (Eq. IA7I >. the number of the paired grids is estimated as 
2 x (l' 2 Al' 2 Aip3 2 i ) /(2n/Q B ) 2 , where A(p 32 i is the variation in the angle bounded by the two side lengths 1' 2 and I3 in the triangle of (I3, 1' 2 , 13), 
due to the variations of 1' 2 and I3 within the bin widths. In the fifth line, we used X^q A <7i23 ~ ^trip('i> ^ 2 > 's) ( see Eq. lA5b . In this step, 

using the fact I3 — l[ via the Kronecker delta <5r »/ , we also replaced B(h, h, l'i)B(l3, 1' 2 , 1' 3 ) with B(h, l 2 , l3)B(l' 1 , 1' 2 , £ 3 ), the product of 

3 1 

the original bispectra taken in the covariance calculation. In the sixth line, we further simplified 2(l' 2 Al 2 Aifi3 2 i) /[Ntri P {li, 1' 2 , i 3 )(27t/0 s ) ] 
by using the similar equations to Eqs. JA7b and dA8t and the fact I3 = l[ to obtain the coefficient 2tz/[Q, s I' 1 AI' 1 ]. 

Performing the similar calculations to other terms in Eq. ( IA12b . the terms of 0(B 2 ) in the bispectrum covariance can be reduced to 
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2tt 

Cov[B(!i),B(JO] N G,o(B») = ^-B{h,hM)B{l[,l^l' 3 ) 



^ + IlA^ + 7permS - 



(A 15) 



These terms scale with survey area as Covng,o(b 2 ) ^ The terms contribute to diagonal terms of the bispectrum covariance as well 

as some off-diagonal terms when the two triangle configurations have the same length within the bin widths at least on their one side (see 
Fig.Q]for the diagrammatic illustration). 

A2.2.2 Non-Gaussian terms of O(PT) Next let's consider the terms proportional to 0{PT) in Eq. JA121 . Similarly to the calculation 
in Eq. JA14t . the first term of Eq. l lA12t can be simplified as 

2? H3i H2i ^3; w yn/ u y 2 32'3' ^123 v ^ ^123 

tri P trip 



P(qi )T(q 2 ,q 3 ,q' 2 , q'^, 5^, Aq 123 {U)^ 

1i Q't 



trip trip ii [v^q's J 

5w ,p ai )T( i2) J 3j « 2j i3)^-^- (27t/a)2 



E A Z 

Z / *-123 

" ' U1 F _ ' trip \—'-/—°/ ^ 

" Qj^ P{h)T{hM ^ l ' z) - (A16) 

In the first line of the r.h.s. of the above equation, we used the fact that the term is non-vanishing only if the two triangle configurations have 
the same length on their one side, h = l'i in this case (see Fig. [TJ. In this step, therefore, we dropped one summation over *^2q, by using 

the Kronecker delta 5g + „, and introduced the Kronecker delta <5^ ; , . We take out the trispectrum from the summation assuming it does 
not largely change within the bin widths, because the 4-point configuration is uniquely specified by the two triangle configurations; the 4 
outer circumference side lengths are given by hjh, I2 and and the diagonal length is given by iiO In the second line, we computed the 
number of the modes given by the summation "Y\ n , n , A_ , / . «/ l 2 , £3) as we did in the calculation of Eq. JA14t . The angle tp 12 / 
is the angle bounded by the two side lengths h and l 2 , and A<p 12 i is the variation due to the bin widths. In the third line, we used the fact 
Aq 123 = iVtrip(Zi) and further simplified the coefficient 2 x (7 2 AZ 2 Ai£> 12 /)/A'trip(Z0 asinEq. JA 14t . 
Hence, the terms of O(PT) in the bispectrum covariance are computed for the limits of ^> If as 

Cov [B(h), B(l[)] NQ(HPT) = f J^P(l 1 )T(h,l a M) + Q^7 P ^)r(i 2) Is, l[,l' 3 ) + 7 perms. (A17) 

Again note that all the trispectra in the terms are uniquely specified by the two triangle configurations; the 4 outer-circumference side lengths 
and the diagonal length (see Fig. [TJ. The terms of O(PT) scale with survey area as Covng, o(pt) The terms contribute to the 

diagonal terms of the bispectrum covariance matrix as well as some off-diagonal terms where the two triangles have the same length at least 
on their one side. 

A2.2.3 Non-Gaussian term of O(Pa) Finally we consider the contribution arising from the connected 6-point correlation function in 
Eq. JaTH : 

Cm[B{U), B(^)] ng ,o(p 6 ) = j±- J2 E - ^ 5 Ls^ CO A 1' 123 

rip trip <?, q[ 



1 (27T) 1 



II d2qi Jf[ d2 ^ P ^1i'l2' «3» 9i> <? 2 > l's) A l 12a h, h)^^ (l'x,l' 2 , 1' 3 ) 



n s N tIip Ar t ' rip m 

The connected 6-point correlation function depends on the 6-point configuration in Fourier space. As shown in Fig.[TJ the two triangle parts 
of the connected 6 point configuration are specified by the two triangle configurations of the bispectra. The selection functions Aq and 
Aq' 2 j automatically satisfies the condition of the 6-point configuration in Fourier space: q 123 + q' 123 = 0. The unspecified part is only the 
angle between the two triangles; here we introduced the angle tp, which is the angle between the vectors q 1 and q 2 in Fourier space. Hence, 
the summations and / ]„/ can be replaced with one-dimensional integration over the angle ip in Fourier space. The covariance term of 
O(Pe) scales with survey area as Cov N q, o(p 6 ) This term contributes to both the diagonal and off-diagonal parts of the bispectrum 

covariance. 

12 The two-dimensional trispectrum is uniquely specified by 5 parameters to characterize the 4-point configuration; e.g., the outer-circumference 4 side 
lengths plus the diagonal length. 
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A2.3 Flat-sky formula for the bispectrum covariance 

Summing up all the terms of Eqs. JA13t . JA15t . JA17t and dA18t . the covariance matrix is found to be given as 



Core[B(Ii,l3,fe),B(J!,Jb,f3)] 



In . 



P(h)P(h)P(h) 



<-K cK c-K , cii cit r/i . r/i ril 

"hl'^hl'^lsl't + hl' 1 ht' 3 °hl' 2 +0 hl' 2 "hl'^hl's 

K 
III', 



S K s-K s-K 



tK sK S K 



+—B{h,h,l 3 )B{l' 1 ,l' 2 ,l' 3 ) 



3 perms. 



+ 



hAh 

K 



+ 7 perms. 



£u^h P{h)T{l2M > l ' 2 > l ' 3) + nj^h Pih)T{l2 ' h ' l,lJ,3) + 7 perms - 



+ 



Us 



2tt 



Pe(ii,J2,Z3,ii,i2,i3; , 0)- 



(A 19) 



However, we should again note that these terms are derived based on the discrete Fourier decomposition formulation, and therefore we 
include the Fourier modes confined within the survey region, where the Fourier modes obey periodic boundary conditions. Hence, the non- 
Gaussian error contributions in the above equation account only for the mode coupling between such Fourier eigenmodes. As we have shown 
in the main part of this paper, we also need to include the halo sample variance (HSV) contribution to the non-Gaussian errors, which arises 
from a coupling of the modes confined within the survey region with the mass density fluctuations of scales comparable with or larger than 
the survey region. 



APPENDIX B: CROSS-COVARIANCE BETWEEN LENSING POWER SPECTRUM AND BISPECTRUM 



Based on the discrete Fourier decomposition formulation, the power spectrum estimator can be defined in lTakada & Bri dle (2007) as 

P{1) - o^by £ k « h -* (B1) 

q-gei 

where the summation is over Fourier modes which have the length of I to within the bin width Al. 

The cross-covariance between the power spectrum P(l) and the bispectrum B(li, h,h) is defined as 

Cov [P(l),B(h,h,l 3 )] = jJ^-^-l— £ J2(nqk-qii qi k q2 K q3 )A qi23 (h) - P(l)B(h,l 2 ,h). (B2) 

l B trip y 

Thus the cross-covariance depends on the 5-point correlation function. The 5-point correlation function in the above equation can be further 
computed as 

(kqR-qKq^q^kq^ = n 2 P(q)B(q 1 ,q 2 ,q i )Sq i23 + Q. 2 P(q)B(q, q 2 , q 3 )5q C +q 1 <55q + q 2+(?3 + 8 perms. 

+Q,sP5(q,-q,q 1 ,q 2 ,q3) s q 123 ( B3 ) 
Inserting this equation into Eq. dB2t and using the similar calculation we have used, we can find the flat-sky formula for the cross-covariance 



as 



Cov [P(0, B(h, h, Is)] = 8fc fl P(l)B(h,l 2 , Is) + 2 perms. + ±- j ^ Ps (I, -I, h,h, Is; 1>), (B4) 
where ij) is the angle between the two vectors I and h in Fourier space. 
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